Announcement

Collapse
No announcement yet.

foldy.py - improved constant folding for SNAPpy

Collapse
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • foldy.py - improved constant folding for SNAPpy

    Since version 2.6, Python has done basic "constant folding" - the replacement of expressions that involve only constant values with their resulting value, rather than recalculating them every time the code is run. This is an even more valuable feature in a resource-limited environment like SNAPpy, but unfortunately Python doesn't go quite as far as one might hope:

    * The only things recognized as constants for this purpose are literal values. Global variables that are never reassigned are considered as constants by SNAPpy (the 64 global variable limit would be unbearably restrictive otherwise), but Python doesn't apply constant folding to them. (Python really has no choice here, because it can never prove that a global variable cannot be reassigned. SNAPpy lacks the dynamic mechanisms that would make indirect reassignment possible.)

    * Only operators applied to constants get folded - there's no possibility of folding a function call with constant inputs, even if it has no side-effects. (Again, Python has no choice - it can never be sure that the function apparently being called will be the one actually found at runtime).

    These limitations were severely bloating the code on a current project of mine, so I wrote a little post-processor that rewrites the bytecode of your functions to apply various optimizations. To use it:

    1. Download the attached files: foldy.py (my code) and byteplay.py (a dependency, not written by me), and place them in your snappyImages folder.

    2. At the top of your main script, and the top of each imported file that needs to use any of the function decorators described here, place the line:
    from foldy import *

    3. At the very bottom of your main script, add the line:
    fold_constants(globals(), False)
    (change False to True to get verbose descriptions of the optimizations being performed in the event log)

    With just those changes, you get the following benefits:
    * Python's constant folding is extended to handle globals that are never reassigned, and the five built-in functions for which folding makes sense (int, len, ord, str, c_h_r).
    * Functions with names starting with an underscore, that aren't actually called in your script, are deleted. There's no point in including them, since RPCs (or otherwise calling a function by name) cannot access them.

    You can request further optimization by placing one of the following decorators above your function definitions:

    @constfunc
    Makes calls to this function, in which all parameters are themselves constant, eligible for constant folding. You are asserting that the function has no side-effects other than its return value, and that it uses only functions that will work in both the Python and SNAPpy environments (since it will actually be called at script compile time). Furthermore, the function itself will be deleted if there are no remaining runtime calls to it after optimization (which includes the possibility of there being no calls to it in the first place).

    @constfunc_keep
    Makes the function eligible for constant folding as described above, but unconditionally keeps it in the final script so that you can RPC it.

    @nokeep
    Makes the function eligible for deletion if there are no calls to it. This would be useful for library functions that aren't of interest to all users of the library. Note that functions with names starting with an underscore are automatically @nokeep.

    @consolidate
    This can be applied to a function with a single string parameter, that returns nothing. Consecutive calls to this function, with the strings all being (possibly folded) constants, get combined into a single call with the concatenation of those strings (as long as that wouldn't put it over the 255 byte limit on SNAPpy string constants). This met a specific need I had, and is likely not useful to you.

    A bit of background: this was inspired by a project involving a library for talking to a LCD display, that had dozens of little functions that take the parameters for a specific display function and build the actual binary command string to be sent via SPI. Here's one of the more complicated ones:
    Code:
    def Vertex2ii(x, y, handle, cell):
        return c_h_r(cell | ((handle << 7) & 0xFF)) + c_h_r(handle >> 1 | ((y << 4) & 0xFF)) + \
            c_h_r(y >> 4 | ((x << 5) & 0xFF)) + c_h_r(x >> 3 | 0x80)
    (As usual, c_h_r substituted for The Function Which Shall Not Be Named)

    These functions are quite commonly called with constant parameters, so all of that bit-thrashing simply ends up producing a 4-character constant string. Writing that constant directly in the script would be vastly more efficient, but of course it would make the script incomprehensible and uneditable. With foldy.py, I get the best of both worlds - applying @constfunc to these functions, and @consolidate to the function that actually sends the commands to the LCD, I can have all of the parameters written out in the source code, yet the script image will contain nothing more than the precompiled command. On a test script using this library, that nearly filled a RF100 module, I saved about 6% script space from the constant folding alone, 31% including the removal of unneeded functions.

    Please note that this is deep magic, relies on undocumented internal details of Portal, and has had about one day of testing at this point. If you use it, consider the possibility that your script might not be doing exactly what you wrote anymore.
    Attached Files
    Last edited by jasonharper; 10-04-2016, 03:10 PM. Reason: version 1.2

  • #2
    This is awesome

    Comment


    • #3
      Jason,

      Thank you for another great set of tips and tricks to help.
      Your previous dumpvars is invaluable to me.

      JC

      Comment


      • #4
        Jason,

        any chance of an update for Portal 2.6?
        at the moment I get this error:
        An error occurred on line 361 while creating the SNAPpy image:tuple index out of range

        this references the byteplay.py file.

        Comment


        • #5
          foldy.py in message #1 has been updated to fix the problem with Portal 2.6 (byteplay.py did not need to be changed).

          The issue was that Portal is now being built with different options or a different version of PyInstaller, with the result that reload() no longer works on built-in modules.

          Comment


          • #6
            Originally posted by jasonharper View Post
            foldy.py in message #1 has been updated to fix the problem with Portal 2.6 (byteplay.py did not need to be changed).

            The issue was that Portal is now being built with different options or a different version of PyInstaller, with the result that reload() no longer works on built-in modules.
            Thank you!

            Comment


            • #7
              foldy.py in the first message has been updated to version 1.2 (byteplay.py has not changed), implementing a couple of experimental features:

              1. Bytecode manipulation that allows Python list comprehensions to mostly work in SNAPpy - only minor changes were required, I guess Synapse didn't realize just how close they were to having this work. In addition to requiring 2.6 firmware (there was no such thing as a list, previously), there are two known limitations due to the shortcuts Synapse took in implementing loop iterators:
              A. List comprehensions do not work in the body of a for loop - to be specific, the listcomp works just fine, but its iterator overwrites the loop's iterator, so the for loop will immediately exit afterwards. Note that a list comprehension works just fine inside a while loop, even if that's inside a for, so you can always introduce a dummy loop to avoid this problem
              B. Nested list comprehensions (those with multiple for clauses) do not work. This is not fixable, as there's no way to interpose a while loop between the clauses.

              Some examples:
              Code:
              def test5():
                  cons = [c for c in xrange(65, 91) if chr(c) not in "AEIOU"]
                  print chr(cons)
                  
              def test6():
                  for i in xrange(5, 26, 5):
                      res = [c for c in xrange(65, 65+i)] + \
                             [c for c in xrange(97, 97+i)]
                      print chr(res)
              
              def test7():
                  for i in xrange(5, 26, 5):
                      while True:
                          res = [c for c in xrange(65, 65+i)] + \
                                 [c for c in xrange(97, 97+i)]
                          print chr(res)
                          break
                      
              def test8():
                  print [i*10 + j for i in xrange(1,4) for j in xrange(1,4)]
                      
              def test9():
                  s = "blah"
                  print [ord(c) for c in s]
                      
              from foldy import *
              fold_constants(globals())
              test5() prints the consonants: "BCDFGHJKLMNPQRSTVWXYZ".

              test6() should print:
              ABCDEabcde
              ABCDEFGHIJabcdefghij
              ABCDEFGHIJKLMNOabcdefghijklmno
              ABCDEFGHIJKLMNOPQRSTabcdefghijklmnopqrst
              ABCDEFGHIJKLMNOPQRSTUVWXYabcdefghijklmnopqrstuvwxy
              ...but ends up printing only the first line of that, due to problem A mentioned above.

              test7() adds a while loop (that never actually loops) around the listcomp, and therefore produces the full output.

              test8() fails due to problem B (3 values printed instead of 9), and is not fixable.

              test9() prints "[98, 108, 97, 104]", and is an easy way to convert a string to a byte list. The list->str conversion is easily done via chr() (you'd need to do this to pass a byte list as a RPC parameter), but the opposite conversion was curiously missing.

              2. Probably nobody but me will ever need this, but I also added the ability to insert arbitrary Python bytecode in a function, for easy experimentation. This is done by writing a 2-element list as a statement, consisting of the opcode name and its argument (None for opcodes that don't use an argument). There's no reason you'd ever want to write such a statement normally, so this shouldn't conflict with anything. Example:
              Code:
              def test10():
                  x = 2
                  y = 3
                  ["LOAD_FAST", "x"]
                  ["LOAD_FAST", "y"]
                  ["BINARY_ADD", None]
                  ["PRINT_ITEM", None]
              This prints "5". Note that there is currently no way of defining or referencing labels within the code, so none of the flow control opcodes are usable.

              Comment

              X