Bars notebook - possible corrections #9

Open
gahobbsau opened this issue Sep 10, 2018 · 5 comments

Comments

@gahobbsau

gahobbsau commented Sep 10, 2018

While running the code in the Bars notebook, I found that the following corrections may be required.

  1. At Volume Bars, 2nd block
    At > v_bar_df = volume_bar_df(df, 'v', 'price', volume_M)
    Corrected to > v_bar_df = volume_bar_df(df, 'v', volume_M)
    ERROR message: TypeError: volume_bar_df() takes 3 positional arguments but 4 were given

  2. At Dollar Value Bars, 1st block
    def dollar_bars()
    At > t = df[column]
    Corrected to > t = df[dv_column]
    Reason: to match argument name

  3. At Dollar Value Bars, 2nd block
    At > dv_bar_df = dollar_bar_df(df, 'dv', 'price', dollar_M)
    Corrected to: dv_bar_df = dollar_bar_df(df, 'dv', dollar_M)
    ERROR message: TypeError: dollar_bar_df() takes 3 positional arguments but 4 were given

  4. Initialisation block
    In my environment, at > import pandas_datareader.data as web
    After "import pandas as pd" I had to add the following line before the datareader line.
    pd.core.common.is_list_like = pd.api.types.is_list_like
    ERROR message: python datareader cannot import name 'is_list_like'
    CHANGE made as per Answer 55 at https://stackoverflow.com/questions/50394873/import-pandas-datareader-gives-importerror-cannot-import-name-is-list-like

  5. Suggestions:
    a) QtConsole: I have found it very helpful to add the following line to the Initialisation block, so as to open a QtConsole on the current notebook kernel for interactive work without cluttering the notebooks.
    %qtconsole
    For example, I saved a lot of the variable datasets to csv files for inspection; also checking dtypes and the like.
    b) Variable Inspector for Notebooks: Install jupyter_contrib_nbextensions (and the jupyter_nbextensions_configurator)
    c) Switch plots between inline and interactive: In the QtConsole, switch between the two with:
    %matplotlib qt > #interactive plotting in separate window
    %matplotlib inline > #normal charts inside notebooks

  6. utils conflict
    a) In my environment, I also found that I needed to rename the file utils both in the src folder and in the Notebook.

  • The name utils must already be in use elsewhere in my environment, so the original line from utils import cprint generated "ImportError: cannot import name 'cprint'".
  • After I renamed the file (to utils_gh, so the line became from utils_gh import cprint), there was no problem and cprint works as intended.
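
To confirm this kind of name clash before renaming anything, one can ask Python which file the name utils actually resolves to. This is only a minimal sketch using the standard library; module_origin is a hypothetical helper name, not something from the notebook.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module name resolves to, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# 'utils' is the module name used in the notebook; run this in the notebook kernel.
# If the printed path is not the repo's src folder, another 'utils' is shadowing it.
print(module_origin("utils"))
```

If the path points into site-packages or another project, renaming the local file (as above) or adjusting sys.path are both plausible ways out.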

I trust that these may assist.
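
For items 1 to 3 above, the three-argument call and the dv_column fix can be sketched as follows. This is an illustration only: the loop body is my assumption of what a threshold-based dollar bar sampler looks like, not a copy of the notebook's code, and a plain dict of lists with made-up numbers stands in for the DataFrame so the snippet has no dependencies.

```python
def dollar_bars(df, dv_column, m):
    """Return the positions where the running dollar value reaches the threshold m."""
    t = df[dv_column]  # use the argument name, not an undefined `column`
    running = 0.0
    idx = []
    for i, x in enumerate(t):
        running += x
        if running >= m:
            idx.append(i)  # this tick closes a bar
            running = 0.0
    return idx


# Illustrative data only; 'dv' is the dollar value column name used in the notebook.
ticks = {
    "price": [10.0, 10.5, 10.2, 10.8, 11.0],
    "dv": [400.0, 700.0, 300.0, 900.0, 200.0],
}
bar_idx = dollar_bars(ticks, "dv", 1000.0)  # three arguments, no 'price'
print(bar_idx)
```

Because the function only indexes df[dv_column], the same call shape works when df is a pandas DataFrame.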

Thank you greatly for sharing your implementations in the 2 notebooks. They have helped greatly in understanding the book and make it more feasible to apply its work.

On reading discussion in some of the other comments, I am encouraged to find that I am not the only one who finds some of the notation obscure or ambiguous.

@aldebaransearch

@gahobbsau and @BlackArbsCEO: Have you tried playing with the different runs-bars? I have problems keeping their dynamics from being very unstable; actually I also have that for the imbalance bars.

As I read the text, the expected values of T (the number of ticks in a bar) and of the imbalance or run lengths can simply be estimated as exponentially weighted moving averages of those same quantities measured on previous bars. Hence E_0[T] is nothing but the exponential moving average of the number of ticks in the preceding bars. Likewise, the expected imbalance or run lengths are just the exponentially weighted averages of the imbalance or run lengths from preceding bars, measured per tick. The latter is what makes them "probabilities", and it is the reason it makes sense to multiply them by E_0[T] to get an estimate of what theta should be for the next bar. This is a slightly different understanding of the text from what I think you have implemented, @BlackArbsCEO, but it could easily be me who hasn't understood the text, your implementation, or maybe both :)
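
To make that reading concrete, here is a minimal sketch of tick imbalance bars under exactly those assumptions. The function names, the initial guesses init_T and init_imb, and the use of the absolute per-tick imbalance are all my own choices for illustration; this is neither the book's code nor the notebook's implementation.

```python
def ewma_update(prev, value, alpha):
    """One step of an exponentially weighted moving average."""
    return alpha * value + (1.0 - alpha) * prev


def tick_imbalance_bars(signs, init_T, init_imb, alpha):
    """Return (end_index, n_ticks) for each bar formed from +1/-1 tick signs.

    init_T   : assumed starting value for the expected ticks per bar, E_0[T]
    init_imb : assumed starting value for the expected |imbalance| per tick
    alpha    : EWMA weight given to the most recent bar
    """
    exp_T, exp_imb = float(init_T), float(init_imb)
    theta, n = 0, 0
    bars = []
    for i, b in enumerate(signs):
        theta += b
        n += 1
        # Close the bar once |theta| reaches E_0[T] times the per-tick imbalance.
        if abs(theta) >= exp_T * exp_imb:
            bars.append((i, n))
            exp_T = ewma_update(exp_T, n, alpha)
            exp_imb = ewma_update(exp_imb, abs(theta) / n, alpha)
            theta, n = 0, 0
    return bars


signs = [1, 1, -1, -1, -1, -1, 1, 1, 1, 1, 1]
bars = tick_imbalance_bars(signs, init_T=4, init_imb=0.5, alpha=0.5)
print(bars)
```

Both expectations are updated only when a bar closes, so each new threshold depends on the lengths of the bars it produced earlier.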

The problem I run into comes in different flavors, depending on the decay I choose for the exponential moving averages. I get the best results with very low decays; however, when a large imbalance or extreme run suddenly appears, the change in theta has a severe effect for a very long time. On rare occasions, this ends up in a strong feedback loop in which bars go, within 10 steps (10 bars), from consisting of 50-100 ticks to suddenly 200000 ticks. Have you experienced something similar?
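
The "very long time" can be quantified with the half-life of an exponential moving average. The sketch below assumes that "decay" means the weight alpha placed on the newest bar in the update alpha * new + (1 - alpha) * old; with a different parameterisation the numbers change.

```python
import math


def ewma_half_life(alpha):
    """Number of updates until a one-off shock's weight in the EWMA has halved."""
    return math.log(0.5) / math.log(1.0 - alpha)


for alpha in (0.5, 0.1, 0.01):
    print(alpha, round(ewma_half_life(alpha), 1))
```

With alpha = 0.01 a single extreme bar still carries half of its weight roughly 69 bars later, which is consistent with the long-lived effect described above.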

I would like to share my code with you if you are interested, but I am not sure what is the appropriate way of doing that here. Please let me know.

@gahobbsau
Author

gahobbsau commented Sep 16, 2018

In the supplied requirements.txt file, change sklearn>=0.19.1 to scikit-learn>=0.19.1.
The sklearn entry generates an error when pip attempts to use it to set up the environment packages.

I created a clean new Conda environment for these Adv_Fin_ML_Exercises notebooks. (I created it using Visual Studio, though creating it with Anaconda Navigator produces an environment with the same limited set of packages.)

Then pip installed the supplied requirements.txt file (found in the top folder) and conda installed Jupyter Notebook and JupyterLab packages into this environment.

I found that upon running the Initialisation block of code, the following packages were reported as missing and could be added to the requirements.txt file:
watermark>=1.6.1
ffn>=0.3.4
pathlib2>=2.3.2
tqdm>=4.25.0
missingno>=0.4.1
ipython>=6.5.0
pymc3>=3.5

After this, running the block generated warnings relating to the theano package, along with some recommended conda installs, which I duly applied to the environment.
After all this, I got a clean run of the Initialisation block in this new environment.
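
A quick way to see which of these are still missing before running the Initialisation block is to probe the import names with the standard library. This is a minimal sketch: missing_modules and NOTEBOOK_IMPORTS are hypothetical names of my own, and the list uses import names (IPython, sklearn) rather than the pip package names.

```python
import importlib.util


def missing_modules(import_names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


# Import names for the packages listed above, plus sklearn (installed as scikit-learn).
NOTEBOOK_IMPORTS = [
    "watermark", "ffn", "pathlib2", "tqdm",
    "missingno", "IPython", "pymc3", "sklearn",
]
print(missing_modules(NOTEBOOK_IMPORTS))
```

An empty list means every name resolves; it does not check that the installed versions meet the minimums above.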

@BlackArbsCEO
Owner

thanks guys for the updates. I'll review the proposed changes and try to incorporate them or reply within a few days.

@aldebaransearch you can create a pull request which has your notebook and scripts as it pertains to the topic. I'll review it, and assuming everything is good I'll approve the merge

@flamby

flamby commented Mar 1, 2019

This great article on de Prado's handling of bars came out last week:
Financial Machine Learning Part 0: Bars
It appears the author will add more articles on the book

@BlackArbsCEO
Owner

@flamby thanks for sharing, the linked article looks informative and well done.

4 participants