Ethereum Maturity

This week I noticed 2 things:

Maturity of the ecosystem

I have acquired the habit of checking the market capitalization of the top 100 cryptocurrencies. Doing it, I can see the lifecycle of the currencies happening: some new births, and some that are gone.

I have also checked some ICOs: reading the white papers, trying to understand who is behind them, whether they are keeping to the calendar they communicated, etc.

On the road to mainstream adoption of Ethereum, I guess there are still many steps the ecosystem has to take.

Official Ethereum App Store

Right now, there are some places where you can track a number of distributed applications based on Ethereum, for instance DappRadar and State of the DApps.

But an official store, with quality criteria that ensure some minimum standard, is still not there.

Apple, Google, Facebook “app stores”

If you try to publish an application on the Apple App Store (to me, currently the one with the highest quality bar), you will notice that there are many things that cannot be done, with the purpose of protecting the end user.

Google's restrictions are also strict, but Facebook's lack control over some of the data an application can retrieve from an end user.

What’s happening now?

Right now there are many gambling games, pyramid schemes, and exchanges, which does not help mainstream adoption.

Who will be able to drive the situation to the existence of an “official DApp store”?

Ethereum is evolving as an ecosystem, and the number of early adopters cannot be ignored.



Bitcoin and Litecoin are global leaders in cryptocurrency. Both are powered by similar technologies, with the exception that Litecoin is a modified, more efficient version of Bitcoin focused on retail applications. As a result, Litecoin is both cheaper and faster to transfer than Bitcoin, but unfortunately may not be as universally accepted.

Some advantages with respect to Bitcoin:

  1. Faster transaction confirmation time (4x faster than BTC).
  2. Increased storage efficiency thanks to the scrypt-based proof-of-work algorithm used by LTC.
  3. More coins to reward miners (84mn to be distributed in total, compared to BTC's 21mn).

Litecoin is one of the more popular coins. It stays in the top 5 cryptocurrencies, but during these last weeks the news about LitePay has not been helping the project (March 2018).


LitePay was announced to be released around the end of February 2018.

Then it was rescheduled for launching at the beginning of March 2018.

Now the launch has been postponed without a defined date.


Another payment solution for Bitcoin and Litecoin comes with a roadmap; I want to follow up on how they deliver:

  • Payment infrastructure ready: 1/May/2018
  • Platform, meet payments: 10/May/2018
  • Developer API release: 17/May/2018
  • Series 1 – Integration: 30/May/2018
  • Series 2 – Integration: 25/June/2018


Thinking Basketball

On the back cover of the book you can read:

Behavioral economics, traffic paradoxes and other metaphors highlight this thought-provoking insight into the NBA and your own thinking.

So I bought the book.

Ben Taylor Author Profile: News, Books and Speaking Inquiries

Some remarks to remember

Individual scoring is not replaced, it is redistributed.

Look at the global impact of a player, not just at the score.

Scoring blindness

A tendency to focus on an individual's scoring while overlooking his other actions that influence the team score.

Individual players are limited in their impact, measured through the WOWY (with or without you) method.
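The WOWY idea can be sketched in a few lines of Python: compare the team's average point margin in games the player played against games he sat out. The game margins below are invented for illustration.

```python
# Minimal sketch of WOWY (With Or Without You): the player's estimated
# impact is the swing in average point margin when he plays vs. when he sits.

def wowy(games):
    """games: list of (margin, played) pairs; margin = team pts - opponent pts."""
    with_player = [m for m, played in games if played]
    without_player = [m for m, played in games if not played]
    avg = lambda xs: sum(xs) / len(xs)
    return avg(with_player) - avg(without_player)

# invented sample: three games with the player, two without
games = [(+8, True), (+5, True), (-2, True), (-6, False), (-4, False)]
impact = wowy(games)  # average margin swing attributable to the player
```

In practice the book stresses that small samples make this noisy, which is exactly the variance point below.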

Variance, rules of thumb

  1. Low variance is “consistent”.
  2. High variance is “inconsistent”.
  3. The greater the variance, the larger the sample needed to reach accurate conclusions.
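Rules 1 and 2 can be made concrete with two hypothetical scorers who have identical averages but very different variance (all point totals invented):

```python
import statistics

# Two made-up scorers: same 20-point average, very different game-to-game spread.
consistent = [20, 21, 19, 20, 20, 21, 19, 20]
inconsistent = [35, 5, 30, 10, 28, 12, 33, 7]

mean_c = statistics.mean(consistent)        # both average 20 points
mean_i = statistics.mean(inconsistent)

var_c = statistics.pvariance(consistent)    # low variance -> "consistent"
var_i = statistics.pvariance(inconsistent)  # high variance -> "inconsistent"

# Rule 3: the second player's variance is far larger, so you need a much
# larger sample of his games before his average is a reliable conclusion.
```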

Sample size insensitivity

A tendency to consider the given sample as sufficient for reaching a conclusion.

Winning bias

When a team wins, in order to explain why they won, we sift through memories of the positive events in the game. When a team loses, we examine the negatives. This phenomenon is at the crux of winning bias.

Winning bias: a tendency to overrate how well an individual performed because his team won and underrate how well an individual performed because his team lost.

Winning bias creates a selection filter to find evidence that supports a particular conclusion.

Late game bias

A tendency to incorrectly weigh events as more important the later they occur in the game.

  • Good teams win early.
  • Clutch play matters little.
  • Hero ball and isolation plays are low-efficiency.
  • Good teams and good offenses don’t need to rely on a “closer”, while teams that are bad in the clutch can still be great NBA champions.

All of these beliefs about the importance of crunch-time, for both teams and players, come from late-game bias.

The rings fallacy

The false belief that championship rings in team sports are a relevant determiner of an individual’s performance.

Championship hindsight

The false belief that after a season ends, only the team that won was a “championship” level team.


How well a player's skills travel to other teams, retaining value on successful ones. For instance: assists, passing, rim protection…

Lone star illusion

A tendency to over-credit one player with the majority of a team’s success when there are no other all-stars on the team.


Our mental scoreboards are constructed from heuristics: ways of seeking the solution to a problem through non-rigorous methods, such as trial and error or empirical rules.

A heuristic is basically intuition, a guess.

Our heuristics become crutches for our narratives. Over the years we have developed a tendency to focus on individual scoring at the expense of Global Offense or Global Defense contributions.


The book reviews different biases that are present in the way narratives are built, simplifications are made, etc. To me, the explanation of these specific biases and of the way narratives are built is the most valuable learning from the book.


Quantopian concepts

I have invested some hours learning about the Quantopian environment and the basic concepts around the platform. The environment is very powerful, so I wanted to gain some clarity on the basics.

Quantopian platform

It consists of several linked components, where the main ones are:

  • Quantopian Research platform is an IPython notebook used for research and data analysis during algorithm creation. It is also used for analyzing the past performance of algorithms.
  • The IDE is used for writing algorithms and kicking off backtests using historical data.
  • Paper trading ability, so you can perform simulations using live data.
  • Alphalens is a Python package for performance analysis of alpha factors which can be used to create cross-sectional equity algorithms.
  • Alpha factors express a predictive relationship between some given set of information and future returns.

The workflow

To make the most of Quantopian, it is important to understand how to work through the different steps to achieve your goals.

The basics are the same as the ones explained in this post, and they are represented by this diagram:

I did not find a diagram anywhere, so I drew my own.

Workflow, step by step

1. Universe Selection: define a subset of tradeable instruments (stocks/futures); the universe should be broad but have some degree of self-similarity to enable extraction of relative value. It should also eliminate hard-to-trade or prohibited instruments. (Example: select companies with >$10B revenue and a dividend rate >3%.)

This is done through the object named “Pipeline”. The idea is not to limit yourself to a set of specific stocks but to define a pipeline of stocks that allows you to quickly and efficiently consider many thousands of companies.

Pipeline allows you to address all companies, then filter them.
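The real Pipeline API only runs inside Quantopian, but the underlying idea of universe selection (start from everything, then screen) can be sketched in plain Python. The company rows, figures, and field names below are invented for illustration:

```python
# Plain-Python sketch of the universe-selection idea behind a Pipeline:
# start from all companies, then filter by fundamentals.

companies = [
    {"symbol": "AAA", "revenue": 25e9, "dividend_rate": 0.04},
    {"symbol": "BBB", "revenue": 4e9,  "dividend_rate": 0.05},
    {"symbol": "CCC", "revenue": 60e9, "dividend_rate": 0.01},
    {"symbol": "DDD", "revenue": 12e9, "dividend_rate": 0.035},
]

def universe(rows, min_revenue=10e9, min_dividend=0.03):
    """Keep only companies passing the revenue and dividend screens."""
    return [r["symbol"] for r in rows
            if r["revenue"] > min_revenue and r["dividend_rate"] > min_dividend]

selected = universe(companies)  # only AAA and DDD pass both screens
```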

2. Single Alpha Factor Modeling:

Initially these four words together sounded like gibberish to me, so I will try to explain them as I understood them: it is a model composed of a single factor that tries to find a result with statistical significance (the alpha factor).

Not enough? OK, I will try to explain some concepts.

First, I need to review some basics of statistics.

What is an alpha factor?

In statistical hypothesis testing, a result has statistical significance when it is very unlikely to have occurred randomly. The level of significance is commonly represented by the Greek symbol α (alpha); significance levels of 0.05, 0.01 and 0.001 are common. (Note that in quant finance, an "alpha factor" more specifically means a predictor of future returns, as defined above; the α of hypothesis testing is a different use of the same letter.)

What is a factor model?

A model in Quantopian is composed of a set of factors; usually it should include:

  • a factor for the market,
  • one or two factors for value/pricing,
  • and maybe a factor for momentum.

Now let’s come back to Quantopian Single Alpha factor Modeling.

It basically means defining and evaluating individual expressions which rank the cross-section of equities in your universe. By applying this relationship to multiple stocks, we can hope to generate an alpha signal and trade off of it.
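A minimal sketch of that ranking step, using an invented one-month momentum figure as the single alpha factor: score every stock, rank the cross-section, then go long the top names and short the bottom ones.

```python
# Single alpha factor ranking the cross-section (all numbers invented).
momentum = {"AAA": 0.12, "BBB": -0.05, "CCC": 0.03, "DDD": 0.20, "EEE": -0.10}

ranked = sorted(momentum, key=momentum.get, reverse=True)  # best to worst

longs = ranked[:2]    # highest-ranked names
shorts = ranked[-2:]  # lowest-ranked names
```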

This can be done in 2 ways:

  • Manually: hand-crafted alphas. For the moment I will start with this method.
  • Deep learning: alphas are learned directly instead of being defined by hand (long short-term memory (LSTM) networks, 1D convolutional nets). I will leave this method for later.


  • Developing a good alpha signal is challenging (for instance: detecting an earnings surprise before the formal announcement based on sentiment data).
  • It’s important to have a scientific mindset when doing this exercise.

By being able to analyze your factors in an IPython notebook, you can spend less time writing and running global backtests. It also enables you to write down assumptions and analyze potential biases.

This is indeed the main function of the Alphalens Python package: to surface the most relevant statistics and plots about a single alpha factor. This information can tell you whether the alpha factor you found is predictive. The statistics cover:

  • Returns Analysis.
  • Information Coefficient Analysis.
  • Turnover Analysis.
  • Sector Analysis.

3. Alpha Combination: you basically combine many single alphas into a final alpha which has stronger prediction power than the best single alpha. Two examples of how to do it:

  • Classifier (E.g.: SVM, random forest).
  • Deep Learning: alphas are learned directly, instead of defined by hand (Long-short term memory (LSTM), 1D convolutional nets).
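Even simpler than a classifier, the most basic combination is to standardize each alpha into z-scores across the universe and average them; this is a hand-rolled sketch, not either of the two methods above, and all values are invented:

```python
import statistics

def zscores(alpha):
    """Standardize one alpha's values across the universe to mean 0, stdev 1."""
    mu = statistics.mean(alpha.values())
    sd = statistics.pstdev(alpha.values())
    return {sym: (v - mu) / sd for sym, v in alpha.items()}

# two invented single alphas over the same three stocks
alpha_momentum = {"AAA": 0.12, "BBB": -0.05, "CCC": 0.03}
alpha_value = {"AAA": 0.5, "BBB": 1.2, "CCC": -0.8}

z1, z2 = zscores(alpha_momentum), zscores(alpha_value)
combined = {sym: (z1[sym] + z2[sym]) / 2 for sym in alpha_momentum}
```

The z-scoring step matters: without it, the alpha with the larger raw scale would dominate the average.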

For simplicity I have started with just one alpha factor, so right now I am skipping this step.

4. Portfolio Construction: implement a process which takes your final combined alpha and your risk model and produces a target portfolio that minimizes risk under your model. The natural steps to perform it are:

  • Define objective: what you want to maximize.
  • Define constraints: such as leverage, turnover, position size…
  • Define risk model: define and calculate the set of risk factors you want to use to constrain your portfolio.
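A naive sketch of those steps, assuming the objective is simply "weights proportional to the combined alpha" and the only constraint is a maximum position size; a real optimizer would also apply the risk model:

```python
# Toy portfolio construction: alpha -> capped, normalized target weights.

def target_weights(alpha, max_position=0.4):
    gross = sum(abs(v) for v in alpha.values())
    raw = {s: v / gross for s, v in alpha.items()}       # proportional to alpha
    capped = {s: max(-max_position, min(max_position, w))  # position-size cap
              for s, w in raw.items()}
    gross2 = sum(abs(w) for w in capped.values())
    return {s: w / gross2 for s, w in capped.items()}    # renormalize to 100% gross

alpha = {"AAA": 0.75, "BBB": -0.05, "CCC": -0.70}        # invented combined alpha
weights = target_weights(alpha)
```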

5. Execution: implement a trading process to transition the current portfolio (if any) to the target portfolio.

TradingView pros and cons

I have been testing TradingView for a month: writing different scripts with the Pine editor, performing some backtesting, annotating results, and writing down assumptions, parameters…

What is really nice

  • Data from whatever market you can think of.
  • You can jump from one instrument to another and visually navigate at all levels with very good response times. The way you can draw and play with standard shapes is quite impressive.
  • The Pine editor is really nice; the learning curve is short, and with a few hours of learning you can build thousands of things.
  • The community of people sharing ideas and scripts is very useful.
  • Timing indicators adjusted to the interval you want. For instance, you can define intervals of 7 minutes, 33 minutes, and so on.

What I miss

  • The Pine editor should let you package functions and play with a function's parameters in an automated way, for instance to scan for the best parameter values.
  • The Pine editor should let you plot outside of the main screen. The context is always the instrument you have on screen; I would like the possibility to plot outside of that context, for instance with a simple script that scans for the best parameter values for my strategy.
  • Strategy tester: I can only read the list of trades, not work with that data (for instance, export it to Excel), so analyzing results is difficult and very tedious.
  • I cannot use other instruments as input for my strategies. For instance, I would like to use a value from the oil industry as an input to define a trading signal for a chemical company. In the Pine editor I can only work with data from the current chart.


The basis

An Ethereum platform for enabling the trading of sports players' rights.

There are different solutions explained in the white paper; the most interesting one to me is the solution for athletes:

GLOBATALENT will allow young players to sell part of their future incomes without having to have an everlasting mortgage on their life.

The other brilliant alternative is the engagement offered to the fans:

Fans will be able to buy future benefits of the club that they support and at the same time are able to make investments and receive profits.

This part of the business does not exist at this moment, so there are no numbers about the volume. The potential is huge: sports betting is very popular right now, and imagine betting not on a specific game but on young or consolidated players who offer part of their rights to supporters, so fans can go long and short a percentage of those rights through each player's own token (can you imagine holding a GlobaPlayerLebronJames token?).

Another revenue stream I see can come from people who have been playing sports games for years: there are people who play these virtual games trading players over a season, running worldwide competitions, etc. Well, in 2019 they will be able to do it live, with real money.

Launch calendar

  • Private Pre-Sale: before the Public Sale.
  • Primary Crowd Sale: from 16th April, 2018 to 6th May, 2018.
  • Partner on-boarding: April – June 2018.
  • Platform release: October – December 2018.

Some numbers

The investment target is 42M€.

The table below shows the sports market revenue.

The total spending on transfer fees per year is a key element for Globatalent. Imagine they are able to manage 2% of $5B; at a 3% fee, that would mean revenue of $3M.
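Spelling out that arithmetic (the 2% managed share and 3% fee are the assumptions from the paragraph above, not published figures):

```python
# Back-of-the-envelope revenue estimate from the paragraph above.
transfer_market = 5e9    # assumed total yearly transfer spending ($)
managed_share = 0.02     # assume Globatalent handles 2% of that volume
fee = 0.03               # assumed 3% fee on the managed volume

volume = transfer_market * managed_share   # $100M managed
revenue = volume * fee                     # $3M revenue
```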


  • GLOBATALENT Tokens (GBT), being security tokens, will be limited to users who are accredited investors.
  • GlobaPlayer Tokens: Each player will have their own custom token.



What is overfitting?

When you are preparing a machine learning solution, you basically work with data sets that contain:

  • Data: relevant and/or important data.
  • Noise: irrelevant and/or unimportant data.

With this data you want to identify a trigger: a signal that responds to the target pattern you want your code to identify.

So you start identifying a pattern and you work to improve it.

Suddenly you improve your pattern identification so much that, at some point, it is not just using the data: the pattern is also using the noise side of the data to trigger the signal.

This phenomenon is not desirable, and it is what is called overfitting.

In the picture on the left:

  • The black line represents a healthy pattern.
  • The green line represents an overfitted pattern.
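A toy numerical version of the same picture, assuming a made-up dataset whose true rule is "label 1 if x > 50" with one noisy training label: the "healthy" model is a simple threshold, while the "overfit" one memorizes every training point (1-nearest-neighbour), noise included.

```python
# (40, 1) is the noisy label: by the true rule x > 50, it should be 0.
train = [(10, 0), (20, 0), (30, 0), (40, 1),
         (60, 1), (70, 1), (80, 1), (90, 1)]
test = [(12, 0), (28, 0), (44, 0), (47, 0), (62, 1), (85, 1)]

def healthy(x):
    """Black line: a simple rule that ignores the noisy point."""
    return 1 if x > 50 else 0

def overfit(x):
    """Green line: 1-nearest-neighbour, memorizes every training point."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)

# healthy: 7/8 on train, 6/6 on test
# overfit: 8/8 on train, only 4/6 on test -- the memorized noise fooled it
```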

Understanding machine learning

I have watched this video from @TessFerrandez: Machine Learning for Developers.

The video explains the process of building a machine learning solution. She explains it in plain English, with very nice examples that make the concepts easy to remember.

The video helped me link a lot of the technical ideas explained in the courses into a natural flow. Now it all makes sense to me.

When do I need a machine learning solution?

Imagine that you have this catalogue of pictures:

and that you want to identify when a picture has a muffin or a chihuahua.

The traditional way to do it is by using “if/else” statements. The results are not going to be good. Why?

  • The problem is more complex than the basic questions you are asking, and it requires thousands of combinations of conditional statements.
  • Finding the right sequence of conditional statements can take years.

This is where machine learning techniques can help you. At the end of the day, it is a different approach to finding a solution to a complex problem.

What are the steps to perform a machine learning solution?

The basic steps to build a machine learning solution are:

1.- Ask a sharp question:

At the end of the day, depending on the question you ask, you will use a different machine learning technique.

What types of machine learning technique can I use? Well, there are many of them, but these are the basic ones:

  • Supervised learning: a learning model based on a set of labeled examples. For instance, you want to identify when there is a cat in a picture, so you train on a set of pictures where you know there are indeed cats.
  • Unsupervised learning: think about a population data set where you use a clustering algorithm to classify the people into five different groups, without saying in advance which type of groups. For instance, when looking for movie recommendations, suddenly a pattern is identified by age (which we initially did not know was a relevant cluster).
  • Reinforcement learning: it uses feedback to make decisions. For instance, a system that measures the temperature, compares it with the target temperature, and then raises or lowers it. This reminds me of the servo systems and fuzzy logic I saw at the electronics level when I studied electronic engineering.
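The thermostat example can be sketched as a tiny feedback loop; the gain and temperatures are invented, and a real reinforcement learning agent would learn the control policy rather than have it fixed like this:

```python
# Toy feedback loop: measure, compare with the target, nudge the temperature.

def step(current, target, gain=0.5):
    error = target - current        # feedback signal
    return current + gain * error   # raise or lower proportionally

temp, target = 15.0, 21.0
for _ in range(10):
    temp = step(temp, target)
# after a few iterations temp has converged close to the 21.0 target
```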

2.- Collect data

Look for data banks; there are plenty on the internet. Of course, if you want precise trading data from a good set of markets with thousands of parameters, you will have to pay for it.

3.- Explore data: relevant, important and simple.

  • Relevant: determine features, define relevant features, discard irrelevant features.
  • Important: define important data.
  • Simple: it has to be simple (for instance, avoid GPS coordinates and replace them with the distance to a lake).

4.- Clean data:

  • Identify duplicate observations,
  • complete or discard missing data,
  • identify outliers,
  • identify and remove structural errors.
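A sketch of those cleaning steps on a made-up price table; the outlier rule here is deliberately crude, just to show the shape of the work:

```python
# Invented raw data with a duplicate, a missing value, and an outlier.
rows = [
    {"id": 1, "price": 100.0},
    {"id": 1, "price": 100.0},    # duplicate observation
    {"id": 2, "price": None},     # missing data
    {"id": 3, "price": 105.0},
    {"id": 4, "price": 9999.0},   # outlier / structural error
    {"id": 5, "price": 98.0},
]

# 1. identify duplicate observations (keep the first occurrence)
seen, deduped = set(), []
for r in rows:
    key = (r["id"], r["price"])
    if key not in seen:
        seen.add(key)
        deduped.append(r)

# 2. discard missing data
complete = [r for r in deduped if r["price"] is not None]

# 3. flag outliers with a crude deviation-from-median rule
prices = sorted(r["price"] for r in complete)
median = prices[len(prices) // 2]
clean = [r for r in complete if abs(r["price"] - median) < 50]
```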

This step is a tedious process, but it also helps to understand the data.

5.- Transform features

Do things like turning the GPS coordinates into the distance to a lake.

6.- Select algorithms

The base algorithms are:

  • Linear regression,
  • decision tree,
  • naive Bayes,
  • logistic regression,
  • neural nets: basically a combination of different layers of data and algorithms.

The more complex algorithms are in many cases composed of the base ones. They could be neural nets, or, if we make things more complex, full neural architectures.

As reflected in the table above, the choice of algorithm depends on the question we want to answer.

7.- Train the model

Apply the algorithms to the cleaned data sets and fine-tune the algorithm.

8.- Score the model

Test the model and evaluate how good/bad it is.

Typical metrics are:

  • Accuracy: of all the predictions, how many were correct?
  • Precision: of the items predicted to be in a class, how many really were in it?
  • Recall: of the items really in the class, how many were found?
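Those three metrics, computed from invented confusion-matrix counts for a binary classifier (say, "is this a chihuahua?"):

```python
# Invented counts: true positives, false positives, true negatives, false negatives.
tp, fp, tn, fn = 40, 10, 45, 5

accuracy = (tp + tn) / (tp + fp + tn + fn)  # correct among all predictions
precision = tp / (tp + fp)                  # correct among predicted positives
recall = tp / (tp + fn)                     # found among actual positives
```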

9.- Use the answer

You did all this with a purpose, so if the solution works, use it. 🙂

The video mentions a couple of tools:

  • Jupyter Notebook (python)
  • Azure Machine Learning Studio: the video includes a demo walking through the tool.

Some other notes:

  • Take notes about the assumptions and decisions you make at every step, as you will have to review them when you want to improve the algorithm.
  • Hyperparameters: the different algorithms have settings that you can define (for instance: how deep will the decision tree be?).
  • Bias / intercept: it refers to error that is not captured by the rest of the model.