Frequencies

This is a simple PowerShell one-liner that counts how often each first letter occurs across the lines of a sample file.

gc './sample' | %{ $_.substring(0,1) } | group
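For readers less used to the aliases, here is a minimal sketch of the same pipeline with full cmdlet names. The `$symbols` array is made-up placeholder input standing in for the contents of `./sample`, and the blank-line filter and upper-casing are additions, not part of the original one-liner.

```powershell
# Hypothetical placeholder input; in practice this would be: Get-Content './sample'
$symbols = 'AAL', 'ABF', 'BARC', 'BP', 'CRH', 'SSE'

# Skip empty lines (Substring(0, 1) throws on an empty string),
# take the first character, then count how often each one occurs.
$firstLetterCounts = $symbols |
    Where-Object { $_.Length -gt 0 } |
    ForEach-Object { $_.Substring(0, 1).ToUpper() } |
    Group-Object -NoElement |
    Sort-Object Name

$firstLetterCounts
```

`-NoElement` drops the Group column, which only repeats the letter and adds nothing to the count.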

Running this over, say, the FTSE 100 symbol list returns:

Count Name                      Group
----- ----                      -----
   10 A                         {A, A, A, A...}
   11 B                         {B, B, B, B...}
    4 C                         {C, C, C, C}
    2 D                         {D, D}
    2 E                         {E, E}
    3 F                         {F, F, F}
    3 G                         {G, G, G}
    5 H                         {H, H, H, H...}
    8 I                         {I, I, I, I...}
    3 J                         {J, J, J}
    1 K                         {K}
    4 L                         {L, L, L, L}
    4 M                         {M, M, M, M}
    3 N                         {N, N, N}
    1 O                         {O}
    6 P                         {P, P, P, P...}
    8 R                         {R, R, R, R...}
   15 S                         {S, S, S, S...}
    2 T                         {T, T}
    2 U                         {U, U}
    1 V                         {V}
    2 W                         {W, W}

This highlights that the symbols are not uniformly spread across the alphabet.

A-F holds roughly a third of the symbols (32 of 100), as does P-Z (36 of 100).

I found this out once when trying to use the ticker symbol to load-balance market data across three servers.

Check the distribution of the data before you use a simple key.
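As a rough illustration of that check, the sketch below takes the first-letter counts from the table above and totals them for a naive three-way split. The equal-width ranges A-I, J-R and S-Z are an assumption made for the example, not necessarily the split that was used at the time.

```powershell
# First-letter counts copied from the FTSE 100 table above.
$counts = [ordered]@{
    A = 10; B = 11; C = 4; D = 2; E = 2; F = 3; G = 3; H = 5; I = 8; J = 3; K = 1
    L = 4; M = 4; N = 3; O = 1; P = 6; R = 8; S = 15; T = 2; U = 2; V = 1; W = 2
}

# Hypothetical naive split: equal-width alphabet ranges, one per server.
$ranges = [ordered]@{ 'A-I' = 'ABCDEFGHI'; 'J-R' = 'JKLMNOPQR'; 'S-Z' = 'STUVWXYZ' }

# Total the symbols that would land on each server.
$load = [ordered]@{}
foreach ($range in $ranges.Keys) {
    $total = 0
    foreach ($letter in $ranges[$range].ToCharArray()) {
        $key = [string]$letter
        # Letters with no symbols (Q, X, Y, Z) are simply absent from $counts.
        if ($counts.Contains($key)) { $total += $counts[$key] }
    }
    $load[$range] = $total
}

$load   # A-I = 48, J-R = 30, S-Z = 22
```

With these ranges the first server would carry more than twice as many symbols as the third.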

Oddly, the second letter is a better key:

    6 A                         {A, A, A, A...}
    3 B                         {B, B, B}
    4 C                         {C, C, C, C}
    5 D                         {D, D, D, D...}
    3 E                         {E, E, E}
    6 G                         {G, G, G, G...}
    4 H                         {H, H, H, H}
    3 I                         {I, I, I}
    2 K                         {K, K}
    8 L                         {L, L, L, L...}
    6 M                         {M, M, M, M...}
    7 N                         {N, N, N, N...}
    2 O                         {O, O}
    4 P                         {P, P, P, P}
    8 R                         {R, R, R, R...}
    9 S                         {S, S, S, S...}
    7 T                         {T, T, T, T...}
    2 U                         {U, U}
    6 V                         {V, V, V, V...}
    2 W                         {W, W}
    2 X                         {X, X}
    1 Z                         {Z}
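The post does not show the command behind the second table. Presumably it only changes the substring offset, so the following is a sketch under that assumption. The `$symbols` array is made-up placeholder input, and the length filter is an addition to avoid an error on lines shorter than two characters.

```powershell
# Hypothetical placeholder input; in practice this would be: Get-Content './sample'
$symbols = 'AAL', 'ABF', 'BARC', 'BP', 'CRH', 'SSE'

# Substring(1, 1) takes the second character; it throws on one-character
# lines, so those are filtered out first.
$secondLetterCounts = $symbols |
    Where-Object { $_.Length -ge 2 } |
    ForEach-Object { $_.Substring(1, 1).ToUpper() } |
    Group-Object -NoElement |
    Sort-Object Name

$secondLetterCounts
```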

Published by chriseyre2000
