Tea requires boiling water (100°C) to brew properly. Giving me a pot of hot water and a tea bag will not allow me to make tea.
I currently work with a project that is controlled by a configuration file.
It has become useful recently to write unit tests for the config file.
We check simple things such as duplicate keys, and even validate keys against a public API.
Given that we are integrating with over 100 services, a mistake can cause a large number of errors. Unit tests help here.
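A minimal sketch of the duplicate-key check in Python (the config format isn't stated in the post, so JSON and the service names below are assumptions):

```python
import json

def reject_duplicate_keys(pairs):
    # json.loads silently keeps the last value when a key repeats,
    # so we use object_pairs_hook to see every key as it is parsed.
    seen, dupes = set(), set()
    for key, _ in pairs:
        if key in seen:
            dupes.add(key)
        seen.add(key)
    if dupes:
        raise ValueError(f"duplicate keys: {sorted(dupes)}")
    return dict(pairs)

# Hypothetical config with a repeated service entry.
config_text = '{"service_a": {}, "service_b": {}, "service_a": {}}'
try:
    json.loads(config_text, object_pairs_hook=reject_duplicate_keys)
except ValueError as err:
    print(err)  # duplicate keys: ['service_a']
```

A check like this runs in an ordinary unit test, so a bad config fails the build rather than a deployment.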
I gave a talk last night at London Functional (hosted by Funding Circle) on Exercism.io and Elixir. During this I attempted a live coding demo.
This had been practiced earlier in the day. I made the typical mistake of trying to make a last-minute change without retesting, and one part of the demo failed. Lesson for the future – don’t change a working demo on the day of a talk!
This looks to be useful:
I have been mentoring Elixir on Exercism.io for over a year now.
In that time I have mentored 4903 solutions across Elixir and Groovy.
This is a great way to keep your skills up to date in a language.
The students will challenge you with details that you will need to research. It’s the questions that you get that will stretch your knowledge.
You will need to explain unusual bugs and concepts to people who may not have English as their first language.
I have been working with a number of Node projects recently.
Keeping dependencies up to date is a big time sink. I use Dependabot to help with this.
Here is a utility that I have written that allows visualisation of module dependencies: https://github.com/chriseyre2000/package_compare
It loads node_modules into a Neo4j graph database.
To use this you need to install Neo4j and the Erlang/OTP runtime, and create a database user with a password.
Here are the important details:
Once you have run mix escript.build, you can use the following:
./package_compare path-to-the/package.json localhost neo4j_username neo4j_password
This can be run across multiple projects to compare their dependencies. Once you have loaded multiple applications you can use this simple query:
MATCH (a) RETURN a
This will allow you to find the core set of dependencies that your applications are using. If two projects have a large core then there may be a common library waiting to be extracted.
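The same core-dependency question can also be sketched without Neo4j, by intersecting the dependency names from each package.json (the packages below are invented for illustration):

```python
import json

def dependency_names(package_json_text):
    # Collect the names from both dependency sections.
    pkg = json.loads(package_json_text)
    return set(pkg.get("dependencies", {})) | set(pkg.get("devDependencies", {}))

# Invented package.json fragments for two hypothetical projects.
project_a = '{"dependencies": {"express": "^4.17.1", "lodash": "^4.17.15"}}'
project_b = '{"dependencies": {"lodash": "^4.17.15", "react": "^16.8.6"}}'

core = dependency_names(project_a) & dependency_names(project_b)
print(core)  # {'lodash'}
```

A large overlap between two projects is the hint that a common library is waiting to be extracted; the graph view simply makes the overlap visible across many projects at once.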
This is an example of an Elixir escript application. It takes an unusual approach with the Sips library: it calls start_link itself so that the database configuration can be supplied on the command line. Normally this would be started as a dependent application, with the config read from a config file.
The UK is in the middle of a programme to implement Smart Motorways on various M and A roads. The intent is to allow the hard shoulder to be used as a normal lane except in the event of an accident, when the variable speed limit signs will mark the lane as out of operation. This seems like a smart idea until you consider how competently the current variable speed limits are operated.
Variable speed limits can only work if the traffic is capable of travelling faster than the prescribed speed. They can only slow the traffic down. If the variable speed limit is set above the current speed of the road then it is wasting its time, and this is the majority of uses of the variable speed limits. In addition, these limits are kept in place far longer than the problem exists. I have frequently travelled through speed restrictions on the M25 where no broken-down car or loose animal was visible. I call the problems these cause artificial traffic jams.
We have an organisation controlling the speeds on motorways that seems unable to reliably determine when a problem has been resolved. Given that this is a similar problem to detecting when a breakdown has happened, we will end up with no hard shoulders and traffic simply breaking down in live lanes.
My team has recently completed a migration (or retirement) of eight MongoDB databases.
Four of them were replaced with lambda functions. (see https://devrants.blog/2019/05/15/replacing-a-mongodb-with-a-lambda/).
One of the databases had no data to migrate (the data was transitory and had no value after use).
The last three required the use of the Database Migration Service (DMS). DMS is configured to follow a database and move any updates over to the new system. This allows for minimal downtime when moving from one database to another, especially when taking backups and restoring would be prohibitive.
The DMS is very quick to use (we had a slightly slower approach as we had to use Terraform to configure it and could only start the jobs using a Jenkins task).
One flaw we found was in the error handling. One of our databases (unknown to us) contained some text fields with the null character (\u0000). This is something that MongoDB can handle that DocumentDB cannot. The migrations failed, reporting the error, but gave no clue as to where to find the problem. This was problematic as the system we are migrating has around 2 million documents.
We eventually took a brute force approach to find the problem records:
- Extract each record to a single file
- Read these files, write them to DocumentDB, and delete the ones that worked.
Eventually we found the 40 problem records. These were deleted from the source system and manually inserted into the new one.
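The null-character scan at the heart of that brute-force pass might look like this in Python (the sample documents are invented; the real job worked from the extracted files):

```python
def contains_null_char(value):
    # Recursively check strings, dicts and lists for \u0000,
    # which MongoDB accepts but DocumentDB rejects.
    if isinstance(value, str):
        return "\u0000" in value
    if isinstance(value, dict):
        return any(contains_null_char(v) for v in value.values())
    if isinstance(value, list):
        return any(contains_null_char(v) for v in value)
    return False

# Invented sample documents standing in for the extracted records.
docs = [
    {"_id": 1, "name": "fine"},
    {"_id": 2, "name": "bad\u0000value"},
    {"_id": 3, "tags": ["ok", "also\u0000bad"]},
]
bad_ids = [d["_id"] for d in docs if contains_null_char(d)]
print(bad_ids)  # [2, 3]
```

A pre-migration scan like this would have located the 40 records up front, instead of after a failed migration with an unhelpful error message.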
Eventually we found other problems with DocumentDB that prevented us from using it for the final database (we moved it to another MongoDB provider).
We now have no MongoDB databases on the platform that is being decommissioned.
Most developers are used to being the technical support for friends and family. The latest incident that I have found was slow to fix.
My mother has a Windows 10 laptop that had started to show the “You are currently running a version of Windows that’s nearing the end of support. We recommend you update to the most recent version of Windows 10 now to get the latest features and security improvements” message.
She started to apply the suggested update, waited a while and the update uninstalled itself.
At this point I was called on to help.
I restarted the update process and it eventually prompted that an HP utility was no longer compatible and needed to be removed. This triggered a restart cycle.
At the next restart and update it suggested that FreeAVG needed to be upgraded or removed. I went for the simple option of uninstalling and downloading a fresh copy. Three reboots later the update manager repeated the same message about FreeAVG.
This was followed by an uninstall of FreeAVG, a restart and another trigger of the update to 1903. Again it asked for FreeAVG to be uninstalled (which was interesting, as the add/remove programs dialog did not include FreeAVG).
Eventually I found a FreeAVG standalone uninstaller. This worked on the second attempt (hint: when it suggests booting into safe mode, it’s not kidding).
Another reboot/update cycle had the update working. It only took 3 hours from first update to fully upgraded.
I don’t know if this is expected behaviour for a mass-market operating system. If I wrote code that required this level of hand-holding then I would be expected to do the install myself.
How an end user that is not very tech savvy is expected to get this working is beyond me.
When I got home I looked to update my even older Windows 10 laptop. The 1803 update failed with an error message to search for, and I am now attempting to use the Windows 10 update assistant to get my machine to v1903.
My team develops and supports the systems that we work on. It is important to know what is normal so that it’s easy to see production problems before they get too serious.
One of the systems that I work on monitors a data source and sends emails out to our subscribers. I am being vague here so as not to breach client confidentiality. This system is a graph of 12 (mostly) micro-services. Knowing that it is healthy is a big undertaking. Here is how we do it.
We have used our logging tool (DataDog) to capture the signals that we receive and the messages that we send. These are charted here on a one-week scale:
The left is what we have detected and the right is what we send. The users are interested in different signals so the spikes will be of different shapes. We can see problems anywhere in the network using these two charts. The one on the right should be similar to the one on the left.
Gaps on the left will always be matched by gaps on the right, which allows us to see at a glance what is missing or abnormal. Gaps on the left are caused by breaks in the input feeds (which we then check); extra gaps on the right are problems in processing the data.
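A rough sketch of that gap comparison, assuming hourly counts exported from the two charts (all numbers invented):

```python
# Invented hourly counts standing in for the two DataDog charts:
# signals detected (left chart) and emails sent (right chart).
detected = {0: 12, 1: 9, 2: 0, 3: 14}
sent = {0: 30, 1: 25, 2: 0, 3: 0}

# A gap in the input feed shows up on both sides.
input_gaps = [hour for hour, count in detected.items() if count == 0]

# A gap on the right with signals on the left points at our own processing.
processing_gaps = [hour for hour, count in sent.items()
                   if count == 0 and detected.get(hour, 0) > 0]

print(input_gaps)       # [2]
print(processing_gaps)  # [3]
```

In practice we read this off the charts by eye; a monitor built on the same rule could alert on the processing gaps automatically.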
We also keep an eye on the errors logged in the past 24 hours. The most frequent error normally requires investigation. Datadog provides a Patterns tool that helps here:
I typically try to fix the most frequent error each day. Here one of the feeds had been broken by a change on the other end.
Given the level of logging that we use I can’t remember the last time that I needed a debugger. Unit tests and logs solve this far quicker.