Co-founded OptimalQ

I am happy to announce that this January I co-founded OptimalQ with my two amazing co-founders, Yechiel Levi and Yadin Haut.


What do we do in OptimalQ? As our website says:

OptimalQ’s proprietary technology harnesses a combination of big data statistical models and real time network information to intelligently look ahead at a set of mobile numbers and, without making a call, assess the physical and mental availability of each lead.

Our availability insights result in more people answering calls when they actually have the time to talk – meaning calls will be longer and likely more productive.

It has been an amazing ride so far, especially from a technical standpoint – I’ve had the time to finally design my “dream stack” (at least for an agile, time-to-market based business) and implement most of it.

We’re using Python as our technology of choice, with DynamoDB, Redis and MySQL as our main data stores. Sensu and Prometheus help us with monitoring and alerting. Logentries is our current logging solution, but ELK is in our future. All of our stack is based on microservices, using DNS-based service-discovery and SDN. We aren’t using containers and automatic orchestration as of yet, but that is coming very soon. We’re currently AWS based, but most of our system is completely open-source and vendor agnostic, which is great, as we’re not bound to any one cloud (and might move in the near future).

It sure has been fun, let’s hope it stays this way. :)

go-raml/raml: An implementation of a RAML parser for Go

As part of a hackathon we had in EverythingMe, I developed and am now releasing the initial version of raml, a RAML 0.8 parser implemented in Go.

RAML is a YAML-based language that describes RESTful APIs. Together with the YAML specification, this specification provides all the information necessary to describe RESTful APIs; to create API client-code and API server-code generators; and to create API user documentation from RAML API definitions.

The raml package enables Go programs to parse RAML files and validate RAML API definitions.

You can find the project here: github.com/go-raml/raml

Update, 2016-05-06: since I am currently quite busy with running OptimalQ, and as this project is in use by several other projects and has several active forks, I’m looking for someone to take it over and make it useful once again. Message me if you’re interested.

Won a hackathon!

Dozens of developers from all over the country attended the first Hubanana hackathon, which was held last weekend in Raanana, Israel for around 24 hours. The focus this time was iBeacon, a technology which uses Bluetooth Low Energy proximity sensing to transmit a universally unique identifier that can be picked up by any compatible device, and which can then be used to determine a device’s physical location or to trigger a location-based action, among other possibilities.

It was a fun hackathon, and my team won first prize, which was an added bonus.

My team’s product for this hackathon was called BeaconTask. The idea was simple: leave beacons around the house, in specific “task stations”. When a family member arrives at a station (kitchen, backyard, etc.) he can then receive a task, which is worth points. Example tasks can be: take out the trash, wash the dishes, etc. He can then accomplish this task and take a photograph of it, and the “Manager” (Mom/Dad/roommate/boss/etc.) can verify that the task has been completed. A verified task awards the person who performed it with points, which can then be used to receive various prizes or rewards (e.g. allowance for kids, days off for an office).

During the hackathon, I improved python-firebase, a wrapper for Firebase’s RESTful API. The forked version can be found here and includes caching.

I mostly worked on synchronizing the various team members, and also developed the back-end and data-model. All of the source code for our team’s project can be found here:
https://github.com/ibeacon-hackathon

Articles (in Hebrew) regarding this hackathon can be found here and here.

Discovered a bug in Python 2.x/3.x

I have recently discovered a bug in Python (both in the 2.x and 3.x families) and offered a patch to solve the issue.

When multiprocessing.Pool’s imap() or imap_unordered() is called with a generator as the iterable parameter, and that generator raises an exception, the _task_handler thread (running the method _handle_tasks) dies immediately, without causing the other threads to stop and without reporting the exception to the main thread (the one that called imap()).

I saw this issue in Python 2.7.8, 2.7.9 and 3.4.2. Didn’t check other versions, but I assume this is a bug in all Python versions since 2.6.

I reported this bug here and attached examples that reproduce this issue, as well as patches for both Python 2.7 and Python 3.4.

The patches I attached do 2 things:

  1. A deadlock is prevented, wherein the main thread waits forever for the Pool thread/s to finish their execution, while they wait for instructions to terminate from the _task_handler thread which has died. Instead, the exception is caught and handled, and the pool’s execution is terminated.
  2. The exception that was raised is caught and passed to the main thread, and is re-thrown in the context of the main thread – so the user can catch it and handle it, or – at the very least – be aware of the issue.
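
The scenario can be sketched roughly as follows. This is not one of the examples attached to the bug report, just a minimal illustration: it uses ThreadPool (which shares the same imap() machinery) to stay self-contained, and the names faulty_inputs, square and consume_with_imap are made up for this post. On an interpreter that includes the fix, the exception surfaces in the calling thread after the results that were produced before the failure; on an affected version, the same loop would be expected to hang instead.

```python
from multiprocessing.pool import ThreadPool


def faulty_inputs():
    """Hypothetical generator that fails partway through producing inputs."""
    yield 1
    yield 2
    raise ValueError("generator failed")


def square(x):
    return x * x


def consume_with_imap():
    """Iterate over imap() results; return (results so far, error message or None)."""
    results = []
    with ThreadPool(2) as pool:
        try:
            for value in pool.imap(square, faulty_inputs()):
                results.append(value)
        except ValueError as exc:
            # With the fix in place, the generator's exception is re-raised here,
            # in the thread that called imap(), instead of being lost.
            return results, str(exc)
    return results, None


if __name__ == "__main__":
    print(consume_with_imap())
```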

Now I’m waiting for a review.

Update, 2015-03-06: patch was reviewed, tests were added, and it was merged into all Python branches.

UTF8MB4 character set in Amazon RDS MySQL server and SQLAlchemy client

I’ve recently had to move a massive dataset that includes UTF-8 strings containing extended set code points (i.e. planes other than the Basic Multilingual Plane, including Emoji) into a MySQL database hosted using Amazon’s RDS.

Even though all of the databases and tables were configured to use purely the utf8 character set and the utf8_unicode_ci collation, and though SQLAlchemy was also configured to use UTF8, I quickly ran into issues:

Incorrect string value: '\xF0\x9D\x8C\x86' for column 'column_name' at row 1 
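
The reason is that MySQL’s utf8 character set stores at most three bytes per character, while code points outside the Basic Multilingual Plane need four bytes in UTF-8. Here is a small sketch that decodes the bytes from the error message and checks whether a string would need utf8mb4; the helper name needs_utf8mb4 is made up for this post.

```python
def needs_utf8mb4(text):
    """Return True if any code point lies outside the Basic Multilingual Plane."""
    return any(ord(ch) > 0xFFFF for ch in text)


# The byte sequence from the error message decodes to a single code point.
sample = b"\xF0\x9D\x8C\x86".decode("utf-8")

if __name__ == "__main__":
    print(hex(ord(sample)))               # code point of the offending character
    print(len(sample.encode("utf-8")))    # number of bytes it takes in UTF-8
    print(needs_utf8mb4(sample))
```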

The solution was:

  1. Read Mathias Bynens’ awesome tutorial: The things we do to store U+01F4A9 PILE OF POO (💩) correctly
  2. After you created a backup of your current database, change the MySQL server settings via the Parameter Groups section of the RDS console:
    • Click “Create DB Parameter Group” 
    • Choose the correct Group Family (probably mysql5.6)
    • Input a group name and description (probably “mysql5.6-utf8mb4” and “MySQL 5.6 using UTF8MB4 charset by default”)
    • Select this new Parameter Group in the console and click “Edit Parameters”. Set the following parameter values:
      character_set_client: utf8mb4
      character_set_database: utf8mb4
      character_set_results: utf8mb4
      character_set_connection: utf8mb4
      collation_connection: utf8mb4_unicode_ci
      collation_server: utf8mb4_unicode_ci
      character_set_server: utf8mb4

      and click “Save Changes”.

    • Go to the “Instances” dashboard, right click your RDS instance and “Modify” it, change the “Parameter Group” option to your newly created Parameter Group and click “Continue”, “Modify DB Instance”.
    • You can “Reboot” the instance if you want to be extra sure the new configuration was loaded.
  3. (Optional) Change the MySQL client settings. For the CLI mysql client, edit /etc/mysql/my.cnf and under [client] add:
    [client]
    default-character-set = utf8mb4

    This is to allow proper viewing of data using the mysql tool.

  4. Modify your existing databases, tables and columns to use UTF8MB4, as explained in the tutorial in step 1.
  5. Modify your SQLAlchemy connection string from:
    mysql+mysqldb://user:password@host:port/dbname

    to:

    mysql+mysqldb://user:password@host:port/dbname?charset=utf8mb4

    (whether or not to add use_unicode=0 is left for the programmer’s discretion.)
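
If your connection strings are built in code, the change in step 5 can be sketched like this. It is a stdlib-only illustration, and the helper name with_utf8mb4 is made up for this post; the resulting URL is what you would then hand to SQLAlchemy’s create_engine().

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_utf8mb4(url):
    """Return the connection URL with charset=utf8mb4 set in its query string."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["charset"] = "utf8mb4"  # overrides any existing charset value
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    url = with_utf8mb4("mysql+mysqldb://user:password@host:3306/dbname")
    print(url)
    # The URL can then be passed on, e.g. sqlalchemy.create_engine(url)
```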

Enjoy.

Fork of go-yaml

As part of my work on go-raml, I needed some additional capabilities from go-yaml, so I forked it and released my own version of it (until it’s merged in, if at all, since the main developer of go-yaml and I see things a bit differently). Here are the details:

New Features:
*  Added new regexp flag: Unmarshal all encountered YAML values with keys
   that match the regular expression into the tagged field of a struct,
   which must be a map or a slice of a type that the YAML value should
   be unmarshaled into. [Unmarshaling only]
*  Now dies in case of a badly formatted YAML tag in a struct field

Bugs:
*  When a type implementing UnmarshalYAML calls the unmarshaler func()
   to unmarshal to a specific type, which fails, followed by it calling
   the func() again with a different output value which succeeds, the YAML
   unmarshaling process still failed. The issue was a d.terrs == nil check
   where len(d.terrs) == 0 was needed

Tests:
*  Lots of new tests for the regexp flag - regexp unmarshaling into maps,
   slices, regexp priority etc.
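
To give a feel for what the regexp flag does, here is a rough Python analogy. It is not the fork’s Go API or its struct tag syntax, only the idea: every key that matches a regular expression is routed into one map, and everything else is handled as usual. The function name split_by_pattern and the sample document are made up for this post.

```python
import re


def split_by_pattern(mapping, pattern):
    """Split a parsed YAML mapping into (keys matching the pattern, the rest)."""
    regex = re.compile(pattern)
    matched = {key: value for key, value in mapping.items() if regex.search(key)}
    rest = {key: value for key, value in mapping.items() if not regex.search(key)}
    return matched, rest


if __name__ == "__main__":
    # Hypothetical parsed document: keys starting with "/" are grouped together.
    document = {"title": "Example API", "/users": {"get": {}}, "/orders": {"get": {}}}
    resources, other = split_by_pattern(document, r"^/")
    print(resources)
    print(other)
```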

Here’s the fork: github.com/advance512/yaml

Redis feature: SPOP optional count argument

As per antirez’s feature request here, I implemented the following feature:

Added count parameter to SPOP:

  • spopCommand() now runs spopWithCountCommand() in case the count param is found.
  • Added intsetRandomMembers() to Intset: Copies N random members from the set into inputted ‘values’ array. Uses either the Knuth or Floyd sample algos depending on ratio count/size.
  • Added setTypeRandomElements() to SET type: Returns a number of random elements from a non empty set. This is a version of setTypeRandomElement() that is modified in order to return multiple entries, using dictGetRandomKeys() and intsetRandomMembers().
  • Added tests for SPOP with count: unit/type/set, unit/scripting, integration/aof
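
For the curious, the two sampling algorithms mentioned above can be sketched in Python as follows. This is an illustration of the general techniques, not a port of the C code: the function names are made up for this post, and the count/size threshold used to pick between the two is an assumption of mine, not the value used in the patch.

```python
import random


def floyd_sample(n, k, rng):
    """Floyd's algorithm: pick k distinct indices from range(n) with k random draws."""
    chosen = set()
    for j in range(n - k, n):
        t = rng.randint(0, j)  # inclusive on both ends
        # If t was already picked, j itself is guaranteed to be unused.
        chosen.add(j if t in chosen else t)
    return sorted(chosen)


def knuth_sample(n, k, rng):
    """Knuth's selection sampling: one pass over range(n), keeping k indices."""
    chosen = []
    for i in range(n):
        needed = k - len(chosen)
        if needed == 0:
            break
        # Keep index i with probability needed / (items left to scan).
        if rng.random() * (n - i) < needed:
            chosen.append(i)
    return chosen


def spop_with_count(members, count, rng=None):
    """Remove and return up to `count` random members of the set `members`."""
    rng = rng or random.Random()
    ordered = sorted(members)
    count = max(0, min(count, len(ordered)))
    # Assumed threshold: scan everything when asking for most of the set.
    if count * 2 > len(ordered):
        indices = knuth_sample(len(ordered), count, rng)
    else:
        indices = floyd_sample(len(ordered), count, rng)
    popped = [ordered[i] for i in indices]
    members.difference_update(popped)
    return popped


if __name__ == "__main__":
    numbers = set(range(10))
    print(spop_with_count(numbers, 3), numbers)
```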

More details can be found here.

Update, 2014-12-18: merged into all Redis branches.

Update, 2016-05-06: officially a part of Redis 3.2.0, with various parts rewritten by antirez for better performance. More info here: Redis 3.2.0 is out!

yEd / Visual Paradigm for UML bug: problem with mouse

Hey there,

Just encountered a frustrating problem trying to use yEd and Visual Paradigm for UML over Ubuntu. I am using a dual screen configuration. When either of these applications (and I am sure many other Java apps) is fullscreen, clicking the menu does not seem to work. Drag’n’drop of shapes seems to be imprecise; in fact, it seems like the mouse has a sort of “offset” between the cursor and the actual mouse position.

This is solved by not having the window in fullscreen.

Here is another description of said problem: http://yed.yworks.com/support/qa/438/java-7-on-ubuntu-menu-mouse-error

Extracting WhatsApp message logs from a WhatsApp database

Just a small update.

After returning from my long trip abroad, having used WhatsApp profusely during the time there, I had lots of conversations I wanted to export to a message log and store for future, older days. (“Wow, I can’t believe she wrote me THAT! ;)”)

These chat logs included lots of media (images, audio clips and videos), which I also obviously wanted to keep. I used an Android phone, so I looked for an existing solution in the Google Play store. I found an application called WhatsApp to Text which is quite nice, but fails to offer the option of exporting the media. I found no other solution in the Play store.

I then looked on-line for another solution. The team of D. Cortjens, A. Spruyt and W.F.C. Wieringa from the University of Amsterdam have published the results of a research project titled WhatsApp Database Encryption. Based on the results of this project, a Python script called WhatsApp Xtract was coded, to allow the generation of WhatsApp message logs – this time with all media intact. Great.

Only thing was, some features weren’t working correctly. Media wasn’t automatically detected in some cases, the generated log files were humungous, the names of contacts were sometimes not displayed, there was no way to see all media files (like in the actual WhatsApp software), it supposedly was able to repair corrupt databases (and salvage what it can from them) but didn’t really, and generally it didn’t satisfy my requirements.

So, I updated it a bit:

v2.5 (updated by Alon Diamant – Mar 14, 2013)
– Improved encrypted Android database detection and decryption code
– Can now repair malformed Android databases (depends on availability of sqlite3 executable)

v2.4 (updated by Alon Diamant – Mar 06, 2013)
– Generates media index file – but crappily, we should set this up better..

v2.3 (updated by Alon Diamant – Mar 05, 2013)
– now generates separate file for each contact
– fixed file search to search for image files in days prior to date given (to fix a bug where because of timezone differences the image file exists but is not found)
– fixed message counts for contacts
– does not list contacts with 0 messages
– now writes version number in generated files
– (Android Version) displays WhatsApp name (server based) if no display name is found for a contact
– (Android Version) Supports Python 2.6
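
The timezone fix in v2.3 boils down to widening the search window when looking for a message’s media file. Here is a rough sketch of the idea, not the script’s actual code: the function name and the days_back default are made up for this post, and the IMG-YYYYMMDD-WA file naming is my assumption about how the Android app names its media files.

```python
from datetime import date, timedelta


def candidate_media_prefixes(message_date, days_back=1, kind="IMG"):
    """List filename prefixes to try: the message date first, then earlier days."""
    prefixes = []
    for offset in range(days_back + 1):
        day = message_date - timedelta(days=offset)
        # Assumed naming convention: IMG-20130301-WA0000.jpg and the like.
        prefixes.append("{}-{:%Y%m%d}-WA".format(kind, day))
    return prefixes


if __name__ == "__main__":
    print(candidate_media_prefixes(date(2013, 3, 1)))
```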

It still is mediocre, and I am not happy with the way it works nor the way it is coded, but it’ll serve for now. I do feel like coding an Android application to do this properly on the phone, with well formatted output files that include all media. We’ll see.

For now, enjoy. You can also find updates in the project repository.

CLI definition language

Hey,

I’m sure you’ve had the occasion on which you developed an interesting application, and now required the ability to control it from the command line. The standard C/C++ functions for implementing various switches and input are no fun to work with, and the various template and composition based solutions are too complex most of the time.

Check out this interesting project:
http://www.codesynthesis.com/projects/cli/

You define the options you want in a very, very simple definition language, compile it into a C++ class file and voila: you can use it in your main() function.

Nice.