Concurrent Data Processing In Elixir (B2)

This book walks you through the various concurrent programming techniques available in Elixir.

The book is still in beta, so it is not yet finished (and has a few last typos to be fixed), but it is a great introduction.
It even covers some of the more recently added features of the venerable GenServer, so you get to see how to use handle_continue, a great technique that allows a GenServer to return from a callback and still carry on working on a problem.

It starts with Tasks, then introduces GenServers.
Then it moves on to GenStage (a system for using backpressure to control the flow of data through an application).
Flow adds wrappers around GenStage to simplify its usage (it’s almost as easy as Streams).

The examples are mostly contained within individual chapters, which makes picking up a topic much easier than in other PragProg books that I have worked through.

Awaiting the B3 release!

Heisenberg And Death March Projects

A simple explanation of the Heisenberg Uncertainty Principle goes as follows:

You have a particle and you are trying to find out where it is and how fast it is moving. You take measurements by shining a laser onto it. If you want more accuracy on the location then you can shine a brighter laser. However, this changes the velocity. If you want a more precise velocity then you need to use a dimmer laser. There is a finite limit to the combined accuracy of position and velocity.

There is a similar problem with projects that are deemed late. Frequently the solution is to add more frequent and intensive status reports. The problem is that each additional status report takes away time that could be spent working, so it actually delays the result. In extreme cases the project will never be finished, as all the time is taken up by reporting.

Learning Go

Later this week I am going to be assessing a developer on a pair programming TDD exercise in Go.
The problem is that I currently don’t know Go.

I do own the book Introducing Go from a Humble Bundle.

These are the notes that I have from working through the first few chapters.
I am basing the analysis upon the 7 Languages in 7 Weeks pattern.

Go is strongly typed.

Brace style is K&R (the opening brace stays on the same line as the function declaration).

The size of int is platform dependent (32 or 64 bits).
Method names seem to use PascalCase (a leading capital letter is what makes a name exported from its package).
Variable names use camelCase.

Variables can only be defined once per scope.
Variables are mutable.

Declared by var
The type follows the name if required:

var x string

I have not seen this in recent languages (this is a Pascal idea rather than the C style of type then name).
Types can be implied by assigning with := (but only inside a function).

const allows the creation of things that cannot be reassigned.

Backticks allow for multiline strings.

The compiler rejects unused local variables (this is a compile error, not just a warning).

Variable names can be redefined at a different scope.
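The declaration rules above can be pulled together in one small program. This is only a sketch: the names greeting, maxRetries, shadowDemo and lineCount are made up for illustration and do not come from the book.

```go
package main

import (
	"fmt"
	"strings"
)

// Package-level declarations must use var or const; := is not allowed here.
var greeting string = "hello" // the type follows the name

const maxRetries = 3 // a constant cannot be reassigned

// shadowDemo returns the value of greeting as seen inside and outside an inner scope.
func shadowDemo() (inner, outer string) {
	{
		greeting := "shadowed" // a new variable that hides the package-level one
		inner = greeting
	}
	outer = greeting // the package-level variable is untouched
	return inner, outer
}

// lineCount reports how many lines a backtick (raw) string literal spans.
func lineCount() int {
	text := `first line
second line`
	return len(strings.Split(text, "\n"))
}

func main() {
	inner, outer := shadowDemo()
	fmt.Println(inner, outer, maxRetries, lineCount())
}
```

Declaring a local variable in main without using it would stop this program from compiling, which is an easy way to see the unused variable rule in action.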

No implicit returns from functions. (Which is unusual for a modern language).
You can name the return variable, assign it and then call return.

You can return multiple values from a function (much like returning a tuple).
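A minimal sketch of both return styles mentioned above. The functions divide and sum are hypothetical examples written for these notes, not taken from the book.

```go
package main

import (
	"errors"
	"fmt"
)

// divide returns two values: the integer quotient and an error.
func divide(a, b int) (int, error) {
	if b == 0 {
		return 0, errors.New("division by zero")
	}
	return a / b, nil
}

// sum uses a named return value: total is assigned in the loop,
// then a bare return hands it back to the caller.
func sum(values ...int) (total int) {
	for _, v := range values {
		total += v
	}
	return
}

func main() {
	q, err := divide(7, 2)
	fmt.Println(q, err)
	fmt.Println(sum(1, 2, 3))
}
```

Returning a value together with an error, as divide does, seems to be the usual way Go code reports failures.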

Go does not handle integer overflow well: unsigned arithmetic silently wraps around.

// factorial computes x! recursively using unsigned machine integers.
func factorial(x uint) uint {
    if x == 0 {
        return 1
    }
    return x * factorial(x-1)
}

This fails for an input of 100: the result overflows and wraps around to 0.
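One way around the wrap-around is the standard library's math/big package. The sketch below is my own comparison, not something from the book; wrappingFactorial and bigFactorial are names invented here.

```go
package main

import (
	"fmt"
	"math/big"
)

// wrappingFactorial is the uint version from these notes: overflow wraps silently.
func wrappingFactorial(x uint) uint {
	if x == 0 {
		return 1
	}
	return x * wrappingFactorial(x-1)
}

// bigFactorial computes x! with arbitrary-precision integers, so it cannot overflow.
func bigFactorial(x int64) *big.Int {
	result := big.NewInt(1)
	for i := int64(2); i <= x; i++ {
		result.Mul(result, big.NewInt(i))
	}
	return result
}

func main() {
	fmt.Println(wrappingFactorial(100))          // wraps around to 0
	fmt.Println(len(bigFactorial(100).String())) // number of decimal digits in 100!
	fmt.Println(bigFactorial(20))
}
```

The cost is that big.Int values are pointers with method calls rather than plain operators, so the code is noisier than the uint version.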

It appears that a function cannot be overloaded.

Interfaces are implicit. You don’t need to state that you are using it.
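A small sketch of what implicit means here. Shape, Rectangle and totalArea are hypothetical names chosen for illustration.

```go
package main

import "fmt"

// Shape is satisfied by any type that has an Area method with this signature.
type Shape interface {
	Area() float64
}

// Rectangle never mentions Shape anywhere in its definition.
type Rectangle struct {
	Width, Height float64
}

// Area is all that is needed for Rectangle to count as a Shape.
func (r Rectangle) Area() float64 {
	return r.Width * r.Height
}

// totalArea accepts anything that satisfies Shape.
func totalArea(shapes ...Shape) float64 {
	total := 0.0
	for _, s := range shapes {
		total += s.Area()
	}
	return total
}

func main() {
	fmt.Println(totalArea(Rectangle{Width: 2, Height: 3}, Rectangle{Width: 1, Height: 1}))
}
```

If the Area method were removed, the call to totalArea would fail to compile, so the check is still static even though nothing is declared.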

The early versions of Go were really weak at package management. They required you to have your code in a subfolder of the GOPATH called src. This is completely insane! The language should not dictate the folder structure.

Tests need to be named Test* and have a parameter of t *testing.T
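A sketch of what such a test might look like. I am assuming it lives in a file whose name ends in _test.go (for example factorial_test.go, a name I made up) and is run with go test; the factorial function is repeated so the file stands alone.

```go
package main

import "testing"

// factorial is the function under test, repeated here so the file is self-contained.
func factorial(x uint) uint {
	if x == 0 {
		return 1
	}
	return x * factorial(x-1)
}

// TestFactorial is picked up by go test because its name starts with Test
// and it takes a single *testing.T parameter.
func TestFactorial(t *testing.T) {
	cases := []struct {
		in, want uint
	}{
		{0, 1},
		{1, 1},
		{5, 120},
	}
	for _, c := range cases {
		if got := factorial(c.in); got != c.want {
			t.Errorf("factorial(%d) = %d, want %d", c.in, got, c.want)
		}
	}
}
```

The table of cases is a common pattern in Go tests; adding a new case is one extra line.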

I think I now know enough for the Pairing Interview.

Seven Databases in Seven Weeks Part 1

The project that I am working on finally got the requirements that allowed us to pick a database. Over the last few years I have been using MongoDB as the lazy choice for storage. For this project a relational database makes more sense, so I have returned to Postgres.

This has encouraged me to reread this book. I am planning on adding some details in these notes to add modern practices to the book.

To start with, the book predates containers, so all of the chapters require you to install the database on your machine. I am not encouraging you to use a database in a container for production. It does, however, allow you to test drive database code.

Over the last decade all of the big databases I have used have been cloud hosted. For a JVM based application we used hsqldb as an embedded replacement for the database in tests.

The book also fails to mention database migrations. This is a technique that allows you to version control the schema of a database.

Coming back to Postgres after years of Mongo makes me realize that you need to work a little harder to get data to and from the store. On the other hand, aggregate queries are far easier to write in SQL than in Mongo. It is also easier to enforce constraints in SQL: types, unique indexes, non-null fields and foreign keys all help.

The book also only works at the database level. I want to add examples of how to connect to the database.

The Trouble With Terraform

Recently I have been working with Terraform to stand up the infrastructure for the project that I am working on. The project involves a database plus a number of lambda functions. We have a CD build pipeline and use Terraform for the infrastructure, run from cdflow.

Terraform is great for defining the infrastructure. The problem comes when you rename something (module names in particular). Terraform will attempt to tear down the service and recreate it. Sometimes AWS services are eventually consistent. This can mean that a deleted resource hangs around for a while after being deleted. A rename of a module will delete and recreate the item, which will frequently fail on the first pass.

You also need to be very careful that you only build to a given environment from a single branch of the build pipeline. Not doing so allows databases to be torn down. I have seen an incident where a developer commented out infra that was not needed for the current build, only for that change to delete the production database and all the backups. There are things that can be done to prevent this (marking backups as requiring an extra switch to be removed). These changes then leave Terraform unable to completely clean up.

I am not arguing against configuration as code, merely noting that you will get a lot of failed builds. Some of these are resolved by rerunning the job/pipeline. Others require the resource to be manually deleted.

Terraform operates at a level of abstraction that can both help and hinder. Some of the abstractions are a little weak (these will improve in time) especially when defining IAM permissions as you end up with inline strings that explicitly define the policy.

My Elixir Education in a Series of Books

I have been collecting a small library of Elixir books.
This is an introduction to the breadth of the topics covered by Elixir.

A lot of these are tutorial style books that need to be worked through to get the benefits.

My first introduction to Elixir was:

This is Seven More Languages In 7 Weeks which covers a range of languages.
This gave a quick overview of a lot of the language.

Next up was the general introduction book:

Introducing Elixir

This is Introducing Elixir (there is a second edition), a fairly straight conversion of Introducing Erlang. This is a gentle introduction to the language.

Next was an earlier version of:

Programming Elixir

This is Programming Elixir (I read one of the earlier editions). This is an in-depth exploration of the language.

Next was

Designing for scalability with Erlang/OTP

This is Designing for Scalability with Erlang/OTP. This goes into more depth on the OTP and how to design for scale. It took me several attempts to work through this.

Next was:

Programming Phoenix

This is Programming Phoenix. I have so far made three attempts to work through this one: the first during the beta, the second when it was finished (I got distracted), and the third recently.

This is one that I am still trying to find the time to read:

Craft GraphQL APIs in Elixir with Absinthe

Each time this one makes it to the top of the list I keep finding another book to read ahead of it.

Metaprogramming Elixir

This is Metaprogramming Elixir which gives a deeper understanding of when to use macros (and when not to).

I won a copy of this on a twitter competition:

Phoenix for Rails Developers

This is Phoenix for Rails Developers, another more gentle introduction to Phoenix. The contrast with Rails is illuminating, pointing out pain points that Phoenix solves.

The next book is less about the code and more about how to get a project to use Elixir:

Adopting Elixir

This is Adopting Elixir. This covers some case studies of Elixir being used in production environments.

This is another book that explains how to design with Elixir

Functional Web Development, with Elixir, OTP and Phoenix

This is Functional Web Development with Elixir, OTP and Phoenix.

The next one would make a great second book for Elixir:

Designing Elixir Systems with OTP

This is Designing Elixir Systems with OTP. The "Do Fun Things with Big, Loud Worker-Bees" mnemonic is a great project structuring approach. It explains the layers that should be used to design a great application.

This is another that I have not yet finished reading, but do get a lot out of:

Learn You Some Erlang for Great Good!

This is Learn You Some Erlang for Great Good! It’s a huge book and covers a lot of details about working with Erlang. I have been meaning to create a repo converting the examples in this book into Elixir.

By the same author is:

Property-Based Testing with PropEr, Erlang, and Elixir

This is Property-Based Testing with PropEr, Erlang and Elixir. The book is more biased towards Erlang. However the ideas in it have changed how I unit test things.

Another one of the books that I have not yet finished (I did buy it in beta):

Real-Time Phoenix

This is Real-Time Phoenix. It covers the soft real-time features of Phoenix.

Another one that I recently finished reading:

Genetic Algorithms in Elixir

This is Genetic Algorithms in Elixir. This covers a topic that you would not naturally associate with Elixir. It makes a good case for why Elixir is very good at it (parallel execution can speed these up).

This is another one that I started working through in beta, and have not yet returned to:

Testing Elixir

This is Testing Elixir. It goes into depth about how to get the most out of ExUnit.

This covers one of the tools that is used heavily by Phoenix.

Programming Ecto

This is Programming Ecto. It covers the database interaction code in more detail than the other books. I like that Ecto provides both Migrations and the data access abstractions.

This is my most recent purchase:

Concurrent Data Processing in Elixir

This is Concurrent Data Processing in Elixir. It seems to cover the tools needed for large scale data processing. Not yet started on this one.

The last one in this list is also as yet unread.

Modern CSS with Tailwind

This is Modern CSS with Tailwind. Technically it is not an Elixir book, but it does form part of the PETAL stack (Phoenix, Elixir, Tailwind, Alpine, LiveView). I do plan to create an unofficial repo with the examples for this in Phoenix.

Why Dependabot Works

This is the theory of why Dependabot works so well.

With atomic changes you know what triggered the break (or find that your tests are unreliable, which is also valuable).

Genetic Algorithms In Elixir

This weekend I have been working through the published version of this book.

The ideas are great, but there are too many errors in the examples to make it easy to work through.

The book builds up a framework for using Genetic Algorithms, including some sophisticated logging, visualisation and performance tuning tricks. Some of the example code clearly belongs to earlier drafts. I am sure that if you download the sample code you could fix the mistakes. For example, some of the logic in the tiger example is wrong in the first instance, but correct when used later on. Also I don’t think that you can pass an anonymous function to apply.

Here is the repo that I have been working on:

Identity Theft

Recently I have been the victim of Identity Theft.

In October and November last year there were a total of 57 attempts to take out financial services in my name. So far two of these have succeeded in obtaining money.

Both of them were payday loan companies. I have reported the identity theft to Action Fraud. Both companies would openly talk to me about the loans and have recorded them as fraudulent.

I have signed up to a credit file checking service to allow me to see what the current state is. I will need to keep this up for a couple of months.

However, neither of the loan companies has made any attempt to check that I am who I say I am. I have been sent the PDF bank statement used as part of the identification by one of the companies. Other than my giving them publicly available information about myself (name, date of birth and address), no attempt has been made to validate who I am.

It can be hard to get a company to talk to you about fraud. Firms should really have an email address that can be used to report this. For example one of the finance checks went to Sky Mobile. I cannot find a means of talking to them without becoming a customer.

GDPR requests seem to be the only way to get a company to talk to you.

Programming Phoenix Chapter 12

This chapter adds an OTP application to the demo.

This is the first chapter where I have found some typos.
info_sys/application.ex needs the following to work:

alias InfoSys.Counter

Without that the app won’t start.

You also need to be careful with the examples. The names of the files are sometimes incorrect (although the listed path is right). Sometimes it asks you to edit a file in Rumbl when it actually means InfoSys.

There are also a lot of mistakes in the supervisor demos. Sometimes aliases are missed (as above) and sometimes it refers to the wrong project.