#prog #parsing #article
Faster general parsing through context-free memoization (pdf, thanks @zukaboo)
(thanks @grammarware, @GabrielFallen)
We present a novel parsing algorithm for all context-free languages. The algorithm features a clean mathematical formulation: parsing is expressed as a series of standard operations on regular languages and relations. Parsing complexity w.r.t. input length matches the state of the art: it is worst-case cubic, quadratic for unambiguous grammars, and linear for LR-regular grammars. What distinguishes our approach is that parsing can be implemented using only immutable, acyclic data structures. We also propose a parsing optimization technique called context-free memoization. It allows handling an overwhelming majority of input symbols using a simple stack and a lookup table, similarly to the operation of a deterministic LR(1) parser. This allows our proof-of-concept implementation to outperform the best current implementations of common generalized parsing algorithms (Earley, GLR, and GLL). Tested on a large Java source corpus, parsing is 3–5 times faster, while recognition—35 times faster.
Честно скажу, я прочитал по диагонали, но фактически алгоритм выглядит, как parsing with derivatives, но сделанный правильно.
Faster general parsing through context-free memoization (pdf, thanks @zukaboo)
(thanks @grammarware, @GabrielFallen)
We present a novel parsing algorithm for all context-free languages. The algorithm features a clean mathematical formulation: parsing is expressed as a series of standard operations on regular languages and relations. Parsing complexity w.r.t. input length matches the state of the art: it is worst-case cubic, quadratic for unambiguous grammars, and linear for LR-regular grammars. What distinguishes our approach is that parsing can be implemented using only immutable, acyclic data structures. We also propose a parsing optimization technique called context-free memoization. It allows handling an overwhelming majority of input symbols using a simple stack and a lookup table, similarly to the operation of a deterministic LR(1) parser. This allows our proof-of-concept implementation to outperform the best current implementations of common generalized parsing algorithms (Earley, GLR, and GLL). Tested on a large Java source corpus, parsing is 3–5 times faster, while recognition—35 times faster.
Честно скажу, я прочитал по диагонали, но фактически алгоритм выглядит, как parsing with derivatives, но сделанный правильно.
ACM Conferences
Faster general parsing through context-free memoization | Proceedings of the 41st ACM SIGPLAN Conference on Programming Language…
Настроение: написать статью вместе с несколькими людьми под псевдонимом Norman Problem, чтобы на статью ссылались, как "N. Problem et al"
🔥10❤1👎1🤩1
Блог*
Без контекста
Мне вот интересно, кто из 136 проголосовавших реально понимает, о чём идёт речь?
Блог*
Без контекста
💩12❤1
Блог*
Понимаете, о чём идёт речь?
Те, кто понимают... Напишите в личку, пожалуйста, мне обсуждать не с кем(
😁3
Блог*
#prog #rust Поясню на практическом примере, почему нужны generic associated types. Посмотрим на трейт std::io::BufRead. Это трейт, который расширяет функционал std::io::Read функционалом, который слишком неэффективно реализовывать напрямую на методах, предоставляемых…
#prog #rust #article
Вот только на практике такой прямолинейный пример просто так не заработает. И в статье The Better Alternative to Lifetime GATs рассказывается, почему.
Вот только на практике такой прямолинейный пример просто так не заработает. И в статье The Better Alternative to Lifetime GATs рассказывается, почему.
Sabrina Jewson
The Better Alternative to Lifetime GATs
Update (2022-05-30): danielhenrymantilla recently released a crate, nougat, which provides a proc macro that allows you to use the technique presented in this article with the same syntax as regular GATs. I encourage you to check it out!
❤2
Forwarded from Jem
https://johnpublic.mataroa.blog/blog/the-asshole-test
Some years back I applied to join IBM's grad scheme, there was a peculiar stage to the process I've not seen elsewhere. It was during the onsite day, where a batch of 20 or so applicants were put through various tests in an IBM office. They called it the "group test".
Around 8 of us were led to a room and asked to solve a puzzle together. Each of us was given an information pack, there was a white board, and a timer ticking down from 60 minutes. At first there was silence as we looked at our packs, then the first voice: "Let's pool our information", someone stands up by the whiteboard, grasping a marker. Silence, it's not clear how this information should be parsed. One person starts reading theirs out word for word. This is not going to scale. Someone interrupts. Before long the whiteboard leader has been deposed and another is wielding the marker. Then another fights for control. The 60 minutes has run out, the puzzle is unsolved. Confused and drained we head off to the next test.
At the end of the day the group is split into two rooms, my room is given the good news and I go on to join the grad scheme.
Six months later I am shadowing a colleague who is running the "group test". I asked him if he'd ever seen a group complete the test? "Oh, it's not about that, this is an asshole test. You see who turns into an asshole under pressure and they don't make it to the next round".
Some years back I applied to join IBM's grad scheme, there was a peculiar stage to the process I've not seen elsewhere. It was during the onsite day, where a batch of 20 or so applicants were put through various tests in an IBM office. They called it the "group test".
Around 8 of us were led to a room and asked to solve a puzzle together. Each of us was given an information pack, there was a white board, and a timer ticking down from 60 minutes. At first there was silence as we looked at our packs, then the first voice: "Let's pool our information", someone stands up by the whiteboard, grasping a marker. Silence, it's not clear how this information should be parsed. One person starts reading theirs out word for word. This is not going to scale. Someone interrupts. Before long the whiteboard leader has been deposed and another is wielding the marker. Then another fights for control. The 60 minutes has run out, the puzzle is unsolved. Confused and drained we head off to the next test.
At the end of the day the group is split into two rooms, my room is given the good news and I go on to join the grad scheme.
Six months later I am shadowing a colleague who is running the "group test". I asked him if he'd ever seen a group complete the test? "Oh, it's not about that, this is an asshole test. You see who turns into an asshole under pressure and they don't make it to the next round".
👍16👎1
Jem
https://johnpublic.mataroa.blog/blog/the-asshole-test Some years back I applied to join IBM's grad scheme, there was a peculiar stage to the process I've not seen elsewhere. It was during the onsite day, where a batch of 20 or so applicants were put through…
Но у меня есть некоторые сомнения в этичности такого теста
#prog #rust #rustlib
В стандартной библиотеке есть тип std::ffi::CStr, предназначенный для представления невладеющих C-строк. Не смотря на то, что для определения этого типа не нужно ничего, чего не было бы в
Таким образом, хоть
В стандартной библиотеке есть тип std::ffi::CStr, предназначенный для представления невладеющих C-строк. Не смотря на то, что для определения этого типа не нужно ничего, чего не было бы в
core
, у этого типа есть методы, в сигнатурах которых фигурирует CString
, владеющая C-строка. CString
для работы требует доступа к динамическому выделению памяти, поэтому не может быть определён в core
.Таким образом, хоть
CStr
и полезен, из-за этих моментов CStr
определён в std
и потому не может быть использован в #![no_std]
программах. Это стало достаточно большой проблемой, чтобы появился крейт cstr_core, который по умолчанию зависит только от core
и подключает фичи, требующие alloc
, только по явному требованию. К счастью, в скором времени это станет ненужным, так как CStr переедет в core, а CString — в alloc.doc.rust-lang.org
CStr in std::ffi - Rust
Representation of a borrowed C string.
🔥4