HTTP 502 Bad Gateway errors in AWS ALB

TLDR: AWS ALB incorrectly returns an HTTP 502 error when HTTP headers are too large, instead of the expected HTTP 431.

Just wanted to note that we’ve discovered a small bug in AWS’ ALB. When making an HTTP/1.x request, the ALB enforces hard size limits that cannot be changed. As listed here, they are:

  • Request line: 16 K
  • Single header: 16 K
  • Whole header: 64 K

Based on RFC 6585, the error that should be returned to the client when the headers exceed these limits is 431 Request Header Fields Too Large. The RFC states:

The 431 status code indicates that the server is unwilling to process the request because its header fields are too large. The request MAY be resubmitted after reducing the size of the request header fields.

It can be used both when the set of request header fields in total is too large, and when a single header field is at fault. In the latter case, the response representation SHOULD specify which header field was too large.

However, in such cases AWS ALB sometimes returns an HTTP 400 Request Header Or Cookie Too Large error (which, at the very least, should have been “Bad Request”) and sometimes an HTTP 502 Bad Gateway error. We reproduced this behavior both in the browser and with manual curl commands.
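
For illustration, here is a minimal Python sketch of the kind of request that triggers this behavior (the host name, header name and size below are placeholders; our actual repro used the browser and plain curl):

    import http.client

    # Placeholder ALB host name; any ALB-fronted endpoint will do.
    conn = http.client.HTTPSConnection("my-alb.example.com")
    conn.putrequest("GET", "/")
    # A single header value well above the 16 K per-header limit.
    conn.putheader("X-Oversized-Header", "a" * 20000)
    conn.endheaders()

    response = conn.getresponse()
    # Per RFC 6585 we would expect 431 here; the ALB answers 400 or 502 instead.
    print(response.status, response.reason)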

You can see more information about the investigation that led to this conclusion here.

Co-founded Mayple

I am happy to announce that in August 2017 I co-founded Mayple with my awesome co-founder, Omer Farkash.

I’m super proud of the product and team we’ve built here, and the positive change we’re creating for small businesses worldwide. We’re fighting the pirates wherever we can, and enacting best practices in everything marketing.

More details can be found on our website.

Using Let’s Encrypt certificates with WordPress on Amazon Lightsail

Today, I set up a WordPress instance on Amazon Lightsail. It’s a nifty little service that allows you to easily launch and manage a virtual private server on AWS. You can find more information about Lightsail here. Helpfully, that same article also guides you through launching a WordPress instance.

Lightsail’s WordPress instance comes with automatically-generated dummy (self-signed) SSL/TLS certificates. That means that when I try to access my website using HTTPS, I get a certificate warning. Not great.

Luckily, there’s a great complementary service called Let’s Encrypt which can help solve this issue. Let’s Encrypt is a free, automated and open certificate authority. We’ll use it to generate valid certificates for our new WordPress instance.

Follow these instructions:

  1. Get your WordPress instance running on Lightsail.
  2. Forward your domain to the instance’s public IP. For example, for the domain example.com this usually means an A DNS record pointing example.com to the public IP, and a CNAME DNS record pointing www.example.com to example.com.
  3. Verify that your website is accessible via HTTP and HTTPS. You’ll get a warning about the HTTPS certificate.
  4. SSH into your instance.
  5. Create a temporary directory:
    mkdir tmp
    cd tmp
  6. Install certbot as explained here:
    wget https://dl.eff.org/certbot-auto
    chmod a+x certbot-auto
  7. Create a .well-known directory in the WordPress htdocs directory:
    mkdir /home/bitnami/apps/wordpress/htdocs/.well-known
  8. Create a .htaccess file in that directory:
    touch /home/bitnami/apps/wordpress/htdocs/.well-known/.htaccess
  9. Add the following contents to the .htaccess file, to make the .well-known directory accessible:
    #
    # Override overly protective .htaccess in webroot
    #
    RewriteEngine On
    Satisfy Any

    You can edit the file using nano or vi, e.g.:

    vi /home/bitnami/apps/wordpress/htdocs/.well-known/.htaccess
  10. Run certbot. Make sure you configure everything as expected and input a real email address when required:
    ./certbot-auto certonly --webroot -w /home/bitnami/apps/wordpress/htdocs/ -d example.com -d www.example.com

    Of course, change example.com to the name of your domain.

  11. If all executes as expected, you’ll see a message congratulating you for successfully acquiring the certificates you required.
  12. Next, edit the Apache configuration file, as explained here:
    sudo vi /opt/bitnami/apache2/conf/bitnami/bitnami.conf

    Comment out (by adding a # at the beginning of each line) the following lines:

    #SSLCertificateFile "/opt/bitnami/apache2/conf/server.crt"
    #SSLCertificateKeyFile "/opt/bitnami/apache2/conf/server.key"

    Add the following lines below:

    # Let's Encrypt
    SSLCertificateFile "/etc/letsencrypt/live/example.com/fullchain.pem"
    SSLCertificateKeyFile "/etc/letsencrypt/live/example.com/privkey.pem"
    SSLCACertificateFile "/etc/letsencrypt/live/example.com/fullchain.pem"

    Of course, change example.com to the name of your domain.

  13. Finally, restart Apache:
    sudo /opt/bitnami/ctlscript.sh restart apache

    You should see the following output:

    Unmonitored apache
    Syntax OK
    /opt/bitnami/apache2/scripts/ctl.sh : httpd stopped
    Syntax OK
    /opt/bitnami/apache2/scripts/ctl.sh : httpd started at port 80
    Monitored apache
  14. Done! You can check whether the correct certificate appears when you access your website at https://www.example.com

Note that Let’s Encrypt certificates expire after 90 days. As explained here, you can either manually renew the certificates every 90 or so days (simply by re-executing steps 10 and 13), or add a cron job that does this for you automatically.

Co-founded OptimalQ

I am happy to announce that this January I co-founded OptimalQ with my two amazing co-founders, Yechiel Levi and Yadin Haut.

What do we do at OptimalQ? As our website says:

OptimalQ’s proprietary technology harnesses a combination of big data statistical models and real time network information to intelligently look ahead at a set of mobile numbers and, without making a call, assess the physical and mental availability of each lead.

Our availability insights result in more people answering calls when they actually have the time to talk – meaning calls will be longer and likely more productive.

It has been an amazing ride so far, especially from a technical standpoint – I’ve had the time to finally design my “dream stack” (at least for an agile, time-to-market based business) and implement most of it.

We’re using Python as our technology of choice, with DynamoDB, Redis and MySQL as our main data stores. Sensu and Prometheus help us with monitoring and alerting. Logentries is our current logging solution, but ELK is in our future. All of our stack is based on microservices, using DNS-based service discovery and SDN. We aren’t using containers and automatic orchestration as of yet, but that is coming very soon. We’re currently AWS-based, but most of our system is completely open-source and vendor agnostic, which is great, as we’re not bound to any one cloud (and might move in the near future).

It sure has been fun; let’s hope it stays this way. :)

go-raml/raml: An implementation of a RAML parser for Go

As part of a hackathon we had at EverythingMe, I developed and am now releasing the initial version of raml, a RAML 0.8 parser implemented in Go.

RAML is a YAML-based language that describes RESTful APIs. Together with the YAML specification, this specification provides all the information necessary to describe RESTful APIs; to create API client-code and API server-code generators; and to create API user documentation from RAML API definitions.

The raml package enables Go programs to parse RAML files and validate RAML API definitions.

You can find the project here: github.com/go-raml/raml

Update, 2016-05-06: since I am currently quite busy with running OptimalQ, and as this project is in use by several other projects and has several active forks, I’m looking for someone to take over it and make it useful once again. Message me if you’re interested. 

Won a hackathon!

Dozens of developers from all over the country attended the first Hubanana hackathon, which was held last weekend in Raanana, Israel for around 24 hours. The focus this time was iBeacon, a technology which uses Bluetooth Low Energy proximity sensing to transmit a universally unique identifier that can be picked up by any compatible device, and which can then be used to determine a device’s physical location or to trigger a location-based action, among other possibilities.

It was a fun hackathon, and my team won first prize, which was an added bonus.

My team’s product for this hackathon was called BeaconTask. The idea was simple: leave beacons around the house, in specific “task stations”. When a family member arrives at this station (kitchen, backyard, etc.) he can then receive a task, which is worth points. Example tasks can be: take out the trash, wash the dishes, etc. He can then accomplish this task and take a photograph of it, and the “Manager” (Mom/Dad/roommate/boss/etc.) can verify the task has been completed. A verified task awards the person who performed it with points, which can then be used to receive various prizes or rewards (e.g. allowance for kids, days off for an office).

During the hackathon, I improved python-firebase, a wrapper for Firebase’s RESTful API. The forked version can be found here and includes caching.

I mostly worked on synchronizing the various team members, and also developed the back-end and data-model. All of the source code for our team’s project can be found here:
https://github.com/ibeacon-hackathon

Articles (in Hebrew) regarding this hackathon can be found here and here.

Discovered a bug in Python 2.x/3.x

I have recently discovered a bug in Python (both in the 2.x and 3.x families) and offered a patch to solve the issue.

When multiprocessing.Pool’s imap() or imap_unordered() is called with the iterable parameter set to a generator, and that generator raises an exception, the _task_handler thread (which runs the _handle_tasks method) dies immediately, without causing the other threads to stop and without reporting the exception to the main thread (the one that called imap()).
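
The examples attached to the report reproduce the problem; for illustration, here is a separate minimal sketch of the same scenario (not the attached example):

    from multiprocessing import Pool

    def double(x):
        return x * 2

    def failing_generator():
        yield 1
        yield 2
        # Raised while the pool's _handle_tasks thread consumes the generator;
        # on affected versions that thread dies silently.
        raise ValueError("boom")

    if __name__ == "__main__":
        pool = Pool(processes=2)
        # On affected versions this loop hangs after the first results; with the
        # patch, the ValueError is re-raised here in the main thread instead.
        for result in pool.imap(double, failing_generator()):
            print(result)
        pool.close()
        pool.join()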

I saw this issue in Python 2.7.8, 2.7.9 and 3.4.2. Didn’t check other versions, but I assume this is a bug in all Python versions since 2.6.

I reported this bug here and attached examples that reproduce this issue, as well as patches for both Python 2.7 and Python 3.4.

The patches I attached do two things:

  1. A deadlock is prevented, wherein the main thread waits forever for the Pool thread/s to finish their execution, while they wait for termination instructions from the _task_handler thread, which has died. Instead, the exception is caught and handled, and termination of the pool execution is performed.
  2. The exception that was raised is caught and passed to the main thread, where it is re-raised, so the user can catch and handle it, or, at the very least, be aware of the issue.

Now I’m waiting for a review.

Update, 2015-03-06: patch was reviewed, tests were added, and it was merged into all Python branches.

UTF8MB4 character set in Amazon RDS MySQL server and SQLAlchemy client

I’ve recently had to move a massive dataset that includes UTF-8 strings containing supplementary code points (i.e. planes other than the Basic Multilingual Plane, including Emoji) into a MySQL database hosted on Amazon RDS.

Even though all of the databases and tables were configured to use the utf8 character set and the utf8_unicode_ci collation, and though SQLAlchemy was also configured to use UTF-8, I quickly ran into issues:

Incorrect string value: '\xF0\x9D\x8C\x86' for column 'column_name' at row 1 

The solution was:

  1. Read Mathias Bynens’ awesome tutorial: The things we do to store U+1F4A9 PILE OF POO (💩) correctly
  2. After you created a backup of your current database, change the MySQL server settings via the Parameter Groups section of the RDS console:
    • Click “Create DB Parameter Group”
    • Choose the correct Group Family (probably mysql5.6)
    • Input a group name and description (probably “mysql5.6-utf8mb4” and “MySQL 5.6 using UTF8MB4 charset by default”)
    • Select this new Parameter Group in the console and click “Edit Parameters”. Set the following parameter values:
      character_set_client: utf8mb4
      character_set_database: utf8mb4
      character_set_results: utf8mb4
      character_set_connection: utf8mb4
      collation_connection: utf8mb4_unicode_ci
      collation_server: utf8mb4_unicode_ci
      character_set_server: utf8mb4

      and click “Save Changes”.

    • Go to the “Instances” dashboard, right click your RDS instance and “Modify” it, change the “Parameter Group” option to your newly created Parameter Group and click “Continue”, “Modify DB Instance”.
    • You can “Reboot” the instance if you want to be extra sure the new configuration was loaded.
  3. (Optional) Change the MySQL client settings. For the CLI mysql client, edit /etc/mysql/my.cnf and under [client] add:
    [client]
    default-character-set = utf8mb4

    This is to allow proper viewing of data using the mysql tool.

  4. Modify your existing databases, tables and columns to use UTF8MB4, as explained in the tutorial in part 1.
  5. Modify your SQLAlchemy connection string from:
    mysql+mysqldb://user:password@host:port/dbname

    to:

    mysql+mysqldb://user:password@host:port/dbname?charset=utf8mb4

    (Whether or not to add use_unicode=0 is left to the programmer’s discretion.) A minimal engine-creation sketch follows after this list.
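
For completeness, here is a minimal sketch of the resulting SQLAlchemy setup (credentials and host are placeholders):

    from sqlalchemy import create_engine, text

    # Placeholder credentials and host; note the charset=utf8mb4 query parameter.
    engine = create_engine(
        "mysql+mysqldb://user:password@host:3306/dbname?charset=utf8mb4"
    )

    with engine.connect() as conn:
        # A 4-byte code point (U+1D306, the one from the error above) now
        # round-trips instead of raising "Incorrect string value".
        result = conn.execute(text("SELECT :value AS v"), {"value": "\U0001D306"})
        print(result.scalar())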

Enjoy.

Fork of go-yaml

As part of my work on go-raml, I needed some additional capabilities from go-yaml, so I forked it and released my own version (until it’s merged in, if at all, since the main developer of go-yaml and I see things a bit differently). Here are the details:

New Features:
*  Added new regexp flag: Unmarshal all encountered YAML values with keys
   that match the regular expression into the tagged field of a struct,
   which must be a map or a slice of a type that the YAML value should
   be unmarshaled into. [Unmarshaling only]
*  Now dies in case of a badly formatted YAML tag in a struct field

Bugs:
*  When a type implementing UnmarshalYAML calls the unmarshaler func()
   to unmarshal to a specific type, which fails, and then calls the
   func() again with a different output value, which succeeds, the YAML
   unmarshaling process still failed. The issue was a d.terrs == nil
   check instead of a len(d.terrs) == 0 check.

Tests:
*  Lots of new tests for the regexp flag - regexp unmarshaling into maps,
   slices, regexp priority etc.

Here’s the fork: github.com/advance512/yaml

Redis feature: SPOP optional count argument

As per antirez‘s feature request here, I implemented the following feature:

Added count parameter to SPOP:

  • spopCommand() now runs spopWithCountCommand() in case the count param is found.
  • Added intsetRandomMembers() to Intset: copies N random members from the set into the provided ‘values’ array. Uses either the Knuth or Floyd sampling algorithm, depending on the ratio count/size.
  • Added setTypeRandomElements() to SET type: returns a number of random elements from a non-empty set. This is a version of setTypeRandomElement() modified to return multiple entries, using dictGetRandomKeys() and intsetRandomMembers().
  • Added tests for SPOP with count: unit/type/set, unit/scripting, integration/aof

More details can be found here.
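
From the client’s side the new argument is straightforward to use. Here is a small illustrative sketch using redis-py, assuming a server and client version that already support the count argument:

    import redis

    r = redis.Redis(host="localhost", port=6379)
    r.sadd("myset", "a", "b", "c", "d", "e")

    # Removes and returns up to 3 random members in a single call.
    popped = r.spop("myset", 3)
    print(popped)               # e.g. [b'c', b'a', b'e']
    print(r.smembers("myset"))  # the members that remain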

Update, 2014-12-18: merged into all Redis branches.

Update, 2016-05-06: officially a part of Redis 3.2.0, with various parts rewritten by antirez for better performance. More info here: Redis 3.2.0 is out!