Comments By: Peter Goodman
- posted on Mar 31, 2011 at 2:53pm | context
-
posted on Sep 11, 2010 at 6:20pm | context
For those interested, regular expressions are a good example of a language that is meant to describe the structure of other languages but cannot describe itself.
A language describing all regular languages contains all regular expressions; however, the language of all regular expressions is not regular. It is context-free: to see this, consider that the syntax of regular expressions allows arbitrarily nested parentheses, yet a regular expression cannot be used to determine whether or not a string has balanced parentheses.
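To make the parentheses argument concrete, here is a minimal Python sketch (the function name is illustrative, not from the original comment). It checks balance with a counter that can grow without bound, which is precisely the kind of memory a finite automaton, and therefore a regular expression, does not have.

```python
def is_balanced(text: str) -> bool:
    """Return True when every '(' in text is closed by a later ')'."""
    depth = 0  # unbounded counter: the state a finite automaton cannot keep
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ')' appeared before its matching '('
                return False
    return depth == 0
```

Any character other than a parenthesis is ignored, so the check can be run directly on the text of a regular expression such as "(a|(b)*)".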
-
posted on Sep 11, 2010 at 5:45pm | context
I recently wrote about the basic algorithm of PEGs (without the followed-by and not-followed-by operators) in my most recent blog post, titled PEGs and Left Recursion.
The algorithms I have written on the subject interpret a grammar data structure. The interpretation is similar to how a recursive descent parser works, with the exception that the algorithms cache the result each time a non-terminal of the grammar is interpreted at a particular point in the input stream.
The algorithm written about in the post roughly corresponds to the workings of this implementation of PEGs without the aforementioned operators.
In the past, I've found it easier to implement PEGs using this formulation to gain power equivalent to the aforementioned operators.
If you have further questions on this subject, then feel free to ask them in the comments section of my most recent blog post about PEGs.
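As a rough illustration of the idea in this comment (interpreting a grammar data structure like a recursive descent parser while caching each non-terminal's result per input position), here is a minimal Python sketch. The tuple encoding of the grammar, the toy grammar and the function names are all assumptions made for the example; this is not the implementation the comment refers to, it omits the lookahead operators, and it does not handle left recursion.

```python
from typing import Dict, Optional, Tuple

# Hypothetical encoding of grammar expressions:
#   ("lit", "x")          match the literal string "x"
#   ("ref", "Name")       match the non-terminal Name
#   ("seq", e1, e2, ...)  match each expression in order
#   ("alt", e1, e2, ...)  ordered choice: the first alternative that matches wins
GRAMMAR = {
    # Sum <- Num "+" Sum / Num
    "Sum": ("alt",
            ("seq", ("ref", "Num"), ("lit", "+"), ("ref", "Sum")),
            ("ref", "Num")),
    # Num <- "0" / "1"
    "Num": ("alt", ("lit", "0"), ("lit", "1")),
}


def parse(grammar, start: str, text: str) -> bool:
    """Return True when the start non-terminal matches the whole of text."""
    # Cache: (non-terminal, position) -> end position, or None on failure.
    memo: Dict[Tuple[str, int], Optional[int]] = {}

    def match(expr, pos: int) -> Optional[int]:
        kind = expr[0]
        if kind == "lit":
            return pos + len(expr[1]) if text.startswith(expr[1], pos) else None
        if kind == "ref":
            key = (expr[1], pos)
            if key not in memo:  # interpret each non-terminal once per position
                memo[key] = match(grammar[expr[1]], pos)
            return memo[key]
        if kind == "seq":
            for sub in expr[1:]:
                pos = match(sub, pos)
                if pos is None:
                    return None
            return pos
        if kind == "alt":
            for sub in expr[1:]:
                end = match(sub, pos)
                if end is not None:
                    return end
            return None
        raise ValueError(f"unknown expression kind: {kind!r}")

    return match(("ref", start), 0) == len(text)
```

In this toy grammar the cache pays off whenever the first alternative of Sum fails after Num has already matched: the second alternative asks for Num at the same position and gets the stored result instead of re-interpreting it.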
-
posted on Jul 6, 2009 at 3:00pm | context
I think I should simply put a message on this blog post for people to ignore it. I thought it was a clever hack at the time of writing it; however, in retrospect it is terrible advice.
-
posted on Jun 20, 2009 at 7:47pm | context
You are correct in that it doesn't necessarily follow that language X can describe itself. For example, if language X is inherently ambiguous and the language, or the method used to parse languages according to its rules, does not allow for ambiguity, then language X simply won't suffice.
Another problem, related to the ambiguity problem above, arises if language X's description requires context-sensitive features, where a word in language X means one thing in one context and another in a different context. If language X can only express context-free rules, then my conclusion does not follow.
The main motivation of the post was to elicit excitement about parsing. I like the description I gave because it is high level enough (it ignores all thoughts about ambiguity, context, recursion, etc) to expose an interesting idea and because the idea itself is closed box (insofar as we don't get stuck in an infinite regress of one thing describing itself describing itself describing ...).
I don't think that I follow your thought in saying this sounds like set theory; could you please elaborate? It's more precise to say that language X can describe a subset of all languages, and if language X itself is within that subset then language X can describe itself; however, I don't feel like this is what you were alluding to.
-
posted on Mar 19, 2009 at 4:57pm | context
I think that PHP should keep the idea of an unrecoverable error for parse errors, and in that case the parser should try to continue parsing the file to find as many parse errors as it can.
I agree that for most other things, using exceptions as the main method for errors would be nice; however, I would add one thing, and this is somewhat inspired by the way Java lays things out (PHP could also, or might already, support this).
Most errors would actually extend Error, which implements Throwable, whereas most user errors and other recoverable errors would extend Exception, which also implements Throwable. This distinction between Error and Exception would allow PHP to throw errors internally that usually are not caught (as catching them would mean catching all Throwables).
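The proposed split can be sketched in Python, purely for illustration. The class names below are stand-ins chosen for this example (UserException plays the role of Exception in the comment, since Python already has a built-in of that name), and none of this describes how PHP itself is implemented.

```python
class Throwable(Exception):
    """Root of everything that can be thrown (stand-in for the proposed Throwable)."""


class Error(Throwable):
    """Engine-level failures that ordinary code is not expected to catch."""


class UserException(Throwable):
    """Recoverable failures; plays the role of Exception in the comment."""


def run(action):
    """Run action, recovering only from the recoverable branch of the hierarchy."""
    try:
        action()
        return "ok"
    except UserException as exc:
        return f"recovered: {exc}"


def bad_input():
    raise UserException("bad input")


def engine_failure():
    raise Error("engine failure")
```

Here run(bad_input) recovers, while run(engine_failure) lets the Error propagate, because the handler names only the recoverable branch; catching Throwable would be the only way to intercept both.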
This would allow existing code to work as usual, as most people wouldn't catch Errors.
-
posted on Feb 21, 2009 at 7:30pm | context
Emilio, case in point: if using empty() without a variable parameter causes a parse error, then empty() is a special case within the parser, which is terrible. empty() should be no different from any other function in the PHP standard library.
-
posted on Dec 29, 2008 at 12:43am | context
You're right, it is cool. One thing I dislike about PHP, as I mentioned, is that there is no actual way to reference or return a function. call_user_func is roughly equivalent to doing: $func = 'a'; $func('hello world'); As shown, $func is unfortunately not a reference to the function "a" itself but rather a string representation of it. When you say it passes a function by reference, do you mean in object context when passing array($this, 'func_name') as the first parameter?
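For contrast, here is a small Python sketch of the difference between calling a function through its name and passing or returning the function itself. The registry dictionary and every name in it are illustrative assumptions, standing in for the engine's lookup of a function by its string name.

```python
def a(message: str) -> str:
    return f"a got: {message}"


# Name-based dispatch, analogous to $func = 'a'; $func('hello world');
# The string is only a key; the function has to be looked up before each call.
registry = {"a": a}
func_name = "a"
by_name = registry[func_name]("hello world")


# Passing the function object itself: no string lookup is involved.
def apply(func, message: str) -> str:
    return func(message)


by_value = apply(a, "hello world")


# Returning a function: the caller receives the function, not its name.
def make_greeter(prefix: str):
    def greet(name: str) -> str:
        return f"{prefix}, {name}"
    return greet
```

Both by_name and by_value hold the same result; the difference is only in what gets passed around, a string in the first case and the function in the second.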
-
posted on Dec 29, 2008 at 12:40am | context
I'm not (primarily) a Java developer... I started with PHP, and have focused most of my learning on it.
-
posted on Dec 29, 2008 at 12:07am | context
Great advice.
-
posted on Nov 19, 2008 at 8:40pm | context
The main idea of the yield control exception (since renamed to MetaResponse) is to change the resource entirely, as though a new response is being made. With that in mind, it made sense to let it bubble up entirely in order to properly escape the current response and then change the application to a new one.
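A minimal Python sketch of that bubbling behaviour, under stated assumptions: only the MetaResponse name comes from the comment, while the route table, handler names, dispatch loop and hop limit are hypothetical and are not taken from the framework being discussed.

```python
class MetaResponse(Exception):
    """Raised to abandon the current response and start a different one."""

    def __init__(self, route: str):
        super().__init__(route)
        self.route = route


def article_page() -> str:
    # Somewhere deep inside the handler we decide this resource cannot be served.
    raise MetaResponse("not_found")


def not_found_page() -> str:
    return "404: no such article"


# Hypothetical route table mapping route names to handlers.
ROUTES = {"article": article_page, "not_found": not_found_page}


def dispatch(route: str, max_hops: int = 5) -> str:
    """Run a handler; if it raises MetaResponse, start over with the new route."""
    for _ in range(max_hops):
        try:
            return ROUTES[route]()
        except MetaResponse as meta:
            route = meta.route  # discard the old response and begin the new one
    raise RuntimeError("too many meta responses")
```

Nothing between the handler and dispatch() catches the exception, so it escapes the current response completely before the new one begins.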
To that end, a 404 or 500 is really just a different response sent to the server. 404 and other HTTP-related errors don't make sense as normal exceptions, i.e. there is no way to 'handle' a 404 if it is caught other than to display something. With this in mind, the meta response for errors is really just the second step, e.g.:
either an error occurred and an exception was caught, in which case we throw a 500 meta response if we can't handle it, or we know there's nothing we can do to recover from the error and we throw a 500 meta response right away. Its purpose isn't to solve any problems, merely to change the response.
-
posted on Aug 21, 2008 at 4:31pm | context
Lewis: Because the process of separating the returned columns can be expensive, I add the following into the query so I can quickly identify whether a query was in fact compiled with PQL:
SELECT 1 AS __pql__, 1 AS __first_table, ... prefixed cols ..., 1 AS __second_table, ... prefixed cols ..., etc.
Then I can do a simple isset($row['__pql__'])
-
posted on Aug 7, 2008 at 5:34am | context
Thanks for that. It is one of many things (the way packages are structured, needless complexity in the db package, etc.) that I need to think over.
-
posted on Jul 2, 2008 at 5:23pm | context
Oh neat, I was thinking of doing something similar (because Python has it too), but I was thinking more of evaluating the tests within a function itself. I realized that might not be the best idea (because my approach would lead to an infinite loop), and I had already written some of the code to remove the *'s and such from the doc block, so I just went from there.
-
posted on Jun 21, 2008 at 6:36am | context
Yes, sorry, I manually moderate comments (as well as letting Akismet do some of the work).
-
posted on Jun 15, 2008 at 7:41pm | context
Thanks! I wouldn't have spotted that. It's now been fixed in Subversion. I think it's clear that subconsciously I want to be in space ;)
-
posted on Jun 8, 2008 at 4:08pm | context
The length of the code is a concern of mine. For the basic model building I find it's quite compact, but for queries it still takes a lot to do certain things. Here is my reasoning behind it thus far, aside from it being fun:
- Queries are unambiguous.
- You can focus more on what you want to get at rather than how you want to get there.
- The language is portable to other domains (as I mentioned, my intention is to make it work for XML).
- As you might have found on my blog, I developed an ORM last year, and although I am very happy with it, it does have its flaws. For some things you just want to write a query because it can be ugly fitting it into the model, and with my previous system, one needed to get a record of something from the database in order to access what that model was related to, instead of being able to do all the relations right from the start. So, first, I wanted an SQL-like syntax that is entirely divorced from the database. Second, I wanted it to be restricted to the common uses, such that switching to actual SQL is easy.
- I wanted to make an ORM entirely divorced from the database.
- Figuring out the join order isn't cheap. I need to create a dependency graph and then recursively build joins out of it (this only happens in the compile() function call). That means that, for it to be decent, at some point I would need to be able to precompile queries (to SQL) and store them somewhere. Segue: one thing I was thinking, though, is that for compilation one would need to provide a unique identifier for a query. This same identifier could be seamlessly used with something like memcached as well.
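The last point can be sketched roughly in Python. This is an illustration under assumptions, not the comment's actual compiler: it uses the standard library's graphlib topological sort (Python 3.9+) in place of the recursive join building described above, the table names are made up, the "compiled" output is a placeholder string rather than real SQL, and a plain dictionary stands in for an external store such as memcached.

```python
from graphlib import TopologicalSorter


def join_order(depends_on):
    """Order tables so that each one comes after the tables it joins against.

    depends_on maps a table name to the set of tables it must be joined to.
    """
    return list(TopologicalSorter(depends_on).static_order())


# In-process stand-in for an external cache keyed by the query's identifier.
_compiled = {}


def compile_once(query_id, depends_on):
    """Compile on first use, then reuse the stored result for the same identifier."""
    if query_id not in _compiled:
        # Placeholder for real SQL generation: only the ordering is shown.
        _compiled[query_id] = " JOIN ".join(join_order(depends_on))
    return _compiled[query_id]


# Hypothetical schema: comments join to posts, posts join to users.
deps = {"comments": {"posts"}, "posts": {"users"}, "users": set()}
```

The expensive ordering step runs only on the first compile_once() call for a given identifier; later calls with that identifier return the stored string without touching the graph.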
-
posted on Apr 22, 2008 at 5:48am | context
An interesting thing I've dreamed of is that PHP would make variable names not require the $. That is, both variable names and function names would be recognized in the same symbol table, thus allowing for the easy mixing of functions and variables that you will notice in languages such as JavaScript that treat functions as first-class entities. To do this, it would seem that the PHP team would only have to make variables not require the $ as a prefix to their name, and simply add the $ to the allowed characters of a symbol. Technically, nothing would go wrong in terms of symbol overlap, so this could be a fun change. But I also don't expect them to do this, which is unfortunate.
-
posted on Apr 22, 2008 at 5:41am | context
Closures are both a convenience and an extremely powerful tool. Generally in PHP, there is the unfortunate mindset that everything that can be done with a closure can simply be done with a predefined function, and then we can just pass the name of that function around. A common example of this would be any PHP function that accepts a callback: it takes the string name of a function and then applies it to something. In the most basic sense, we could instead create a function on the fly and pass it to that original function; however, clearly this doesn't add much value to closures.
Instead, let's consider closures and their scope. In JavaScript, if I define a variable outside of a function, then that function can access that variable. I can keep nesting functions and that (global) variable will still be accessible to the innermost depths. PHP, on the other hand, has a different way of handling globals. Best practice has global variables as being explicit, i.e. you need to use $GLOBALS or the "global" keyword. In its current state, we can observe that PHP's global keyword seems to look up to the parent scope, for example:
    $var = 123;
    function foo() {
        global $var;
        // $var is now accessible from within foo()
    }
If this tradition were continued and anonymous functions were introduced, then a bit of annoying trampolining, as it were, would need to be done to have the same effect:
    $var = 123;
    function foo() {
        global $var;
        $anon = function() {
            global $var;
            // yay! $var is accessible.
        };
    }
And so we see that in trying to propagate variables down into nested closures, we go against best practices, i.e. not using global variables, and end up with some messy code. Clearly, in its current state, closures won't yield the nice benefit of scoped variables that we've come to expect from JavaScript.
So, what can we still rely on? First, let's look at the current state of PHP functions:
    function foo() {
        function bar() {
            // ...
        }
    }
This is legal PHP, but it doesn't have the effect one might expect. Instead of bar() being local to foo, and only existing within foo, when foo is called bar becomes a globally accessible function. In this sense, PHP has failed to encapsulate bar within foo. One would expect this to change with the introduction of lexical closures.
Thus far I've really made no argument for lexical closures/anonymous functions in PHP besides a minor fix to the way it encapsulates functions. Given this, WHY do I think adding closures to PHP is a good idea? In general, I think it would represent a shift in the mindset and workings of PHP. Adding closures would require internal changes to how PHP works, unless closures just end up being syntactic sugar for create_function. The changes I'm talking about would mean that PHP would need to start recognizing functions in an entirely different way: they would be first class, scoping would work the way one would expect it to work in a language like JavaScript, etc. Almost all of this represents a dream for a better PHP, and most of it isn't even practical given how PHP works. And so, I will answer the first question by stating that closures would help PHP insofar as they would require it to get out of the hole that it has dug with create_function, using strings to call functions, and other such hacks.
Your other question, about when closures would be nice to have and applicable to real-world coding, is much simpler to answer than you might think. Given that PHP has only one namespace (this changes in newer versions), we obviously cannot have one function named the same as another. Also, with many algorithms, especially ones that are most easily understood with recursion, we often need more than one function to perform the operations, and thus we end up having to create our basic function, then a few other functions with a prefix of some sort or some other way to identify them.
Obviously, we could go the Java way and put all these multistep algorithms into a class, but I've always found that ugly. Consider quicksort and mergesort. PHP already provides these built-in, but whatever. Both these algorithms have distinct divide and conquer steps. It would be especially convenient, and very understandable, if one could simply do:
    function quicksort(...) {
        function divide(...) { }
        function conquer(...) { }
        ...
    }
    function mergesort(...) {
        function divide(...) { }
        function conquer(...) { }
        ...
    }
Neither of these algorithms requires the abstraction of a class, but both of them benefit from the encapsulation of closures. Within (and only within) each main function, the closures are defined and are very obviously linked to the parent functions, without the need to create a set of global functions with prefixes.
I don't feel that the above answer will convince everyone, as prefixes or different function names, etc., are simply the accepted way of doing things. I do, however, feel that closures provide the programmer with a different (and I think better) way of approaching problems, which is to give the programmer the ability to work on subproblems and not have to deal with any annoying name conflicts. I also think that the changes that closures would require of PHP would be beneficial in general. So, regardless of whether this has convinced you, or just exposed some of the inherent limitations of PHP's current setup, I think it's in your best interest to explore other languages (such as JavaScript, Python, Ruby, etc.) that support closures and get a feel for how they change your workflow and your approach to problems.
As a side note, with closures, one might also expect partial application and proper currying ability, which would just be *sweet*.
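To see what the quicksort/mergesort layout looks like in a language that already scopes nested functions this way, here is a minimal Python sketch. The bodies are filled in only for illustration, since the comment leaves them elided: each top-level function defines its own divide and conquer helpers, and the two pairs never collide because neither is visible outside its parent.

```python
def quicksort(items):
    def divide(values):
        # Split around the first element as pivot.
        pivot, rest = values[0], values[1:]
        smaller = [v for v in rest if v < pivot]
        larger = [v for v in rest if v >= pivot]
        return smaller, pivot, larger

    def conquer(smaller, pivot, larger):
        # Sort each side recursively and stitch the pieces together.
        return quicksort(smaller) + [pivot] + quicksort(larger)

    if len(items) <= 1:
        return list(items)
    return conquer(*divide(items))


def mergesort(items):
    def divide(values):
        # Split into two halves.
        mid = len(values) // 2
        return values[:mid], values[mid:]

    def conquer(left, right):
        # Merge two already sorted lists.
        merged = []
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        return merged + left[i:] + right[j:]

    if len(items) <= 1:
        return list(items)
    left, right = divide(items)
    return conquer(mergesort(left), mergesort(right))
```

Both functions return a new sorted list and leave their input untouched; no global divide or conquer is ever created, so no prefixes are needed.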
-
posted on Mar 4, 2008 at 1:42am | context
Normally I do that. In fact this has no real benefit, as it's simple to manipulate the level and propagate changes up the tree. The MPTT implementation didn't include a level field, and so I thought it would be fun to see how it could be implemented iteratively with PHP.

Good suggestion. I will look into that. I'm used to Subversion, so it's been my go-to tool, but I think that it's about time I look into other solutions.
Also, I suppose that privately hosting the subversion repository on my domains isn't the most welcoming to possible future contributors!