<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<title>EVE Market Strategies – </title>
<style type="text/css">code{white-space: pre;}</style>
<style type="text/css">
div.sourceCode { overflow-x: auto; }
table.sourceCode, tr.sourceCode, td.lineNumbers, td.sourceCode {
margin: 0; padding: 0; vertical-align: baseline; border: none; }
table.sourceCode { width: 100%; line-height: 100%; }
td.lineNumbers { text-align: right; padding-right: 4px; padding-left: 4px; color: #aaaaaa; border-right: 1px solid #aaaaaa; }
td.sourceCode { padding-left: 5px; }
code > span.kw { color: #007020; font-weight: bold; } /* Keyword */
code > span.dt { color: #902000; } /* DataType */
code > span.dv { color: #40a070; } /* DecVal */
code > span.bn { color: #40a070; } /* BaseN */
code > span.fl { color: #40a070; } /* Float */
code > span.ch { color: #4070a0; } /* Char */
code > span.st { color: #4070a0; } /* String */
code > span.co { color: #60a0b0; font-style: italic; } /* Comment */
code > span.ot { color: #007020; } /* Other */
code > span.al { color: #ff0000; font-weight: bold; } /* Alert */
code > span.fu { color: #06287e; } /* Function */
code > span.er { color: #ff0000; font-weight: bold; } /* Error */
code > span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
code > span.cn { color: #880000; } /* Constant */
code > span.sc { color: #4070a0; } /* SpecialChar */
code > span.vs { color: #4070a0; } /* VerbatimString */
code > span.ss { color: #bb6688; } /* SpecialString */
code > span.im { } /* Import */
code > span.va { color: #19177c; } /* Variable */
code > span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code > span.op { color: #666666; } /* Operator */
code > span.bu { } /* BuiltIn */
code > span.ex { } /* Extension */
code > span.pp { color: #bc7a00; } /* Preprocessor */
code > span.at { color: #7d9029; } /* Attribute */
code > span.do { color: #ba2121; font-style: italic; } /* Documentation */
code > span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code > span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code > span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
</style>
<link rel="stylesheet" href="style.css">
<!--[if lt IE 9]>
<script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
<![endif]-->
<meta name="viewport" content="width=device-width">
</head>
<body>
<p>
<h1>EVE Market Strategies</h1>
<h3>Published by <a href="http://blog.orbital.enterprises/">Orbital Enterprises</a></h3>
</p>
<div style="width:500px">
<iframe src="https://ghbtns.com/github-btn.html?user=OrbitalEnterprises&repo=eve-market-strategies&type=watch&count=true" allowtransparency="true" frameborder="0" scrolling="0" width="110px" height="20px"></iframe>
<p> <br></p>
</div>
<nav id="TOC">
<ul>
<li><a href="#preface">Preface</a><ul>
<li><a href="#summary-of-chapters">Summary of Chapters</a></li>
<li><a href="#change-log">Change Log</a></li>
</ul></li>
<li><a href="#preliminaries">Preliminaries</a><ul>
<li><a href="#market-structure">Market Structure</a><ul>
<li><a href="#order-mechanics">Order Mechanics</a></li>
<li><a href="#price-formation">Price Formation</a></li>
<li><a href="#transaction-costs">Transaction Costs</a></li>
<li><a href="#information-disclosure">Information Disclosure</a></li>
<li><a href="#contract-system">Contract System</a></li>
</ul></li>
<li><a href="#market-data">Market Data</a><ul>
<li><a href="#market-history-endpoints">Market History Endpoints</a></li>
<li><a href="#order-book-data-endpoints">Order Book Data Endpoints</a></li>
<li><a href="#pricing-data-endpoints">Pricing Data Endpoints</a></li>
<li><a href="#structure-location-endpoints">Structure Location Endpoints</a></li>
<li><a href="#discovering-trade-data">Discovering Trade Data</a></li>
</ul></li>
<li><a href="#tools-used-in-this-book">Tools used in this Book</a><ul>
<li><a href="#market-data-tools">Market Data Tools</a></li>
<li><a href="#static-data-export-sde">Static Data Export (SDE)</a></li>
<li><a href="#jupyter-notebook">Jupyter Notebook</a></li>
</ul></li>
<li><a href="#example-analysis-and-calculations">Example Analysis and Calculations</a><ul>
<li><a href="#example-1---data-extraction-make-a-graph-of-market-history">Example 1 - Data Extraction: Make a Graph of Market History</a></li>
<li><a href="#example-2---book-data-compute-average-daily-spread">Example 2 - Book Data: Compute Average Daily Spread</a></li>
<li><a href="#example-3---trading-rules-build-a-buy-matching-algorithm">Example 3 - Trading Rules: Build a Buy Matching Algorithm</a></li>
<li><a href="#example-4---unpublished-data-build-a-trade-heuristic">Example 4 - Unpublished Data: Build a Trade Heuristic</a></li>
<li><a href="#example-5---important-concepts-build-a-liquidity-filter">Example 5 - Important Concepts: Build a Liquidity Filter</a></li>
<li><a href="#example-6---a-simple-strategy-cross-region-trading">Example 6 - A Simple Strategy: Cross Region Trading</a></li>
<li><a href="#setting-up-isolated-environments-with-conda">Setting up Isolated Environments with Conda</a></li>
</ul></li>
</ul></li>
<li><a href="#arbitrage">Arbitrage</a><ul>
<li><a href="#what-makes-arbitrage-possible">What Makes Arbitrage Possible?</a></li>
<li><a href="#ore-and-ice-refinement-arbitrage">Ore and Ice Refinement Arbitrage</a><ul>
<li><a href="#example-7---detecting-ore-and-ice-arbitrage-opportunities">Example 7 - Detecting Ore and Ice Arbitrage Opportunities</a></li>
<li><a href="#example-8---ore-and-ice-arbitrage-back-test">Example 8 - Ore and Ice Arbitrage Back Test</a></li>
</ul></li>
<li><a href="#scrapmetal-processing-arbitrage">Scrapmetal Processing Arbitrage</a><ul>
<li><a href="#example-9---detecting-scrapmetal-arbitrage-opportunities">Example 9 - Detecting Scrapmetal Arbitrage Opportunities</a></li>
<li><a href="#example-10---scrapmetal-arbitrage-back-test">Example 10 - Scrapmetal Arbitrage Back Test</a></li>
<li><a href="#analysis-limitations">Analysis Limitations</a></li>
</ul></li>
<li><a href="#strategy-effectiveness">Strategy Effectiveness</a></li>
<li><a href="#variants">Variants</a><ul>
<li><a href="#selling-with-limit-orders">Selling with Limit Orders</a></li>
<li><a href="#example-11---ore-and-ice-arbitrage-with-limit-orders">Example 11 - Ore and Ice Arbitrage with Limit Orders</a></li>
<li><a href="#capture-dumping-with-buy-orders">Capture “Dumping” with Buy Orders</a></li>
<li><a href="#example-12---finding-bid-targets">Example 12 - Finding Bid Targets</a></li>
</ul></li>
<li><a href="#practical-trading-tips">Practical Trading Tips</a><ul>
<li><a href="#keep-up-with-static-data-export-changes">Keep Up with Static Data Export Changes</a></li>
<li><a href="#beware-ghost-and-canceled-orders">Beware “Ghost” and Canceled Orders</a></li>
<li><a href="#use-multi-buy">Use Multi-buy</a></li>
</ul></li>
</ul></li>
<li><a href="#market-making-station-trading">Market Making (Station Trading)</a><ul>
<li><a href="#market-making-basics">Market Making Basics</a><ul>
<li><a href="#example-13---detecting-market-making">Example 13 - Detecting Market Making</a></li>
</ul></li>
<li><a href="#selecting-markets">Selecting Markets</a><ul>
<li><a href="#example-14---selecting-market-making-targets">Example 14 - Selecting Market Making Targets</a></li>
</ul></li>
<li><a href="#testing-a-market-making-strategy">Testing a Market Making Strategy</a><ul>
<li><a href="#example-15---modeling-orders-and-trades">Example 15 - Modeling Orders and Trades</a></li>
<li><a href="#example-16---testing-market-making-strategies">Example 16 - Testing Market Making Strategies</a></li>
</ul></li>
<li><a href="#strategy-effectiveness-1">Strategy Effectiveness</a></li>
<li><a href="#variants-1">Variants</a><ul>
<li><a href="#relisting">Relisting</a></li>
<li><a href="#example-17---evaluating-relisting-targets">Example 17 - Evaluating Relisting Targets</a></li>
</ul></li>
<li><a href="#practical-trading-tips-1">Practical Trading Tips</a><ul>
<li><a href="#funding-requirements">Funding Requirements</a></li>
<li><a href="#order-layering">Order Layering</a></li>
<li><a href="#pricing-tricks">Pricing Tricks</a></li>
</ul></li>
</ul></li>
</ul>
</nav>
<h1 id="preface">Preface</h1>
<p>EVE Online is unique among online games in that the in-game market has significant importance to players, regardless of the game play style they choose to pursue. Whether in high, low or null security, many (probably all) players rely on EVE’s markets to buy and equip ships, sell acquired or produced goods, buy supplies needed for various endeavors, and otherwise fill gaps in the set of things needed to pursue various play styles.</p>
<p>As the market has long been an important feature of EVE, some players have similarly focused on developing tools and strategies to support the use of markets. In the early days, these tools were intended to supplement standard play styles, for example making it easier to track market participation (e.g. positions, open orders) as part of a larger strategy (e.g. mining, production, conquest). However, as the game has grown, so has the sophistication of the markets, and in modern EVE “playing the markets” is a valid play style pursued by many players. Third party tool developers have adapted to these changes as well, with many new tools written explicitly to support profiting from the market. CCP, for their part, has responded by providing more real-time data (e.g. five minute market snapshots), and more visibility into alternative markets (e.g. citadel markets). The combination of new tools and rich data has made it possible to implement many “real life” trading strategies that were previously difficult to execute<a href="#fn1" class="footnoteRef" id="fnref1"><sup>1</sup></a>.</p>
<p>This book attempts to be a systematic discussion of data driven analysis and trading strategies in EVE’s markets. The strategies we derive here rely on public data provided by CCP. Not all of this data is market data and, in fact, more complicated strategies often rely on non-market data. Regardless, all of this data is publicly available: the strategies we describe here don’t rely on non-public or otherwise proprietary data. We derive concepts and strategies using publicly available tools, and we share our techniques through a number of examples (including source code). While the example code we present here is suitable for illustrative purposes, serious market players will likely want to write custom, optimized code for certain strategies (as the author does).</p>
<h2 id="summary-of-chapters">Summary of Chapters</h2>
<p>The remainder of the book is organized as follows:</p>
<ul>
<li><a href="#preliminaries">Chapter 1 - Preliminaries</a> This chapter describes key features of the EVE markets and introduces tools and data sources used to describe trading strategies. This chapter concludes with several examples introducing the tools.</li>
<li><a href="#arbitrage">Chapter 2 - Arbitrage</a> This chapter describes arbitrage trading which is the process of discovering undervalued market items which can be purchased and turned into profit very quickly.</li>
<li><a href="#market-making-station-trading">Chapter 3 - Market Making</a> This chapter describes market making strategies, better known as station trading, which profit from market participants willing to “cross the spread” instead of waiting for better prices through limit orders.</li>
</ul>
<h2 id="change-log">Change Log</h2>
<ul>
<li>2017-06-18: Finished “Market Making” chapter.</li>
<li>2017-05-05: Found gap problem in order book data. Updated examples and text in Chapters 1 and 2 with work around.</li>
<li>2017-04-21: Finished first two chapters. Removed placeholder for Chapter 3. Public announcement on forums.</li>
<li>2017-02-12: Beyond the initial two chapters, we have four additional chapters planned:
<ul>
<li>“Market Making” - an analysis of market making strategies (a.k.a. station trading).</li>
<li>“Simulating Trading Strategies” - eventually this will describe a simple back test simulator we’re developing.</li>
<li>“Risk” - this chapter will give a more careful treatment to risk around trading (and EVE in general).</li>
<li>“Trend Trading” - this chapter will discuss trend-based trading strategies.</li>
</ul></li>
</ul>
<h1 id="preliminaries">Preliminaries</h1>
<p>This chapter explains the mechanics of EVE markets and introduces the various sources of market data we use to develop trading strategies. We also introduce the tools we’ll use in the rest of the book to develop our strategies. We end the chapter with several examples illustrating these tools.</p>
<h2 id="market-structure">Market Structure</h2>
<p>EVE markets are governed by a set of rules which define how orders are placed, how they are matched, and what charges are imposed on the participants in a trade. Market structure in EVE is somewhat unique, but if comparisons to actual financial markets must be made, then EVE markets are most similar to a commodities spot market. Commodities markets trade physical assets (or rather, the right of ownership of assets), much the same as trading occurs in EVE markets. A “spot market” is a market in which assets are exchanged at current prices, as opposed to prices guaranteed by a contract (e.g. futures markets).</p>
<p>Many good descriptions of EVE markets are already available on the web. We recommend reading EVE University’s <a href="http://wiki.eveuniversity.org/Trading">Trading Introduction</a> for a thorough overview of the interface, important skills and basic mechanics. Rather than recount that description here, we instead provide a more academic treatment of some key aspects of EVE markets below.</p>
<h3 id="order-mechanics">Order Mechanics</h3>
<p>There are two order types in EVE markets:</p>
<ul>
<li>A <em>market order</em> is an instruction to buy or sell an asset at the current best available price.</li>
<li>A <em>limit order</em> is an instruction to buy or sell an asset at a fixed price, which may differ from the current best available price.</li>
</ul>
<p>Although this terminology is standard in financial literature, the EVE UI does not use these terms in order placement screens. Generally speaking, if you are placing an order using the “advanced” UI screen (with a duration other than “immediate”), then you are placing a limit order. Otherwise, you are placing a market order. This is true even if your limit order is matched immediately, and has important consequences in terms of trading cost (see below).</p>
<p>Orders are handled “First In, First Out” (FIFO), meaning the sequence of order arrival determines priority for matching with other orders. When a sell order is placed, the items being sold are placed in escrow at the seller’s location. If a sell order is filled, then the sold items are moved to the buyer’s hangar at the seller’s location. In other words, all transactions occur at the seller’s location. This has at least two important consequences for trading strategies:</p>
<ol type="1">
<li>If the buyer is not in the same station as the seller, then the buyer may need to transport purchased goods elsewhere; and,</li>
<li>If the seller is in a player-owned structure which is not accessible to the buyer (e.g. access controls are changed after the sale), then the buyer’s assets may become stranded.</li>
</ol>
<p>Market participation from player-owned structures is a relatively recent feature (at time of writing). However, the number of such participants is growing, making it important to consider access risk when buying goods from sellers in these structures.<a href="#fn2" class="footnoteRef" id="fnref2"><sup>2</sup></a></p>
<h3 id="price-formation">Price Formation</h3>
<p>Trade prices in EVE markets are determined by order matching rules. Unlike other financial markets, there is no facility for price negotiation or auction, although some of these features exist in secondary markets (e.g. the contract system). Sell orders in EVE have a location, price, quantity and minimum volume. Buy orders have a location, price, quantity, minimum volume and range. A pair of orders match when:</p>
<ol type="1">
<li>The location of the sell order is within range of the location specified in the buy order.</li>
<li>The sell order represents the first, lowest priced sell order within range of the buy order.</li>
<li>The buy order represents the first, highest priced buy order within range of the sell order.</li>
<li>A price exists which meets the pricing constraints (see below).</li>
</ol>
<p>The price at which the transaction occurs must satisfy the following constraints:</p>
<ol type="1">
<li>A sell limit order must match at or above the posted price on the order.</li>
<li>A buy limit order must match at or below the posted price on the order.</li>
<li>A buy or sell market order must match at the posted price on the order.</li>
<li>If the sell order has precedence (arrived at the market first), then the transaction price is the maximum price which satisfies the above constraints.</li>
<li>If the buy order has precedence, then the transaction price is the minimum price which satisfies the above constraints.</li>
</ol>
<p>The effect of precedence rules means that some care must be taken when pricing orders. For example, suppose a sell limit order is placed for 100 ISK. Suppose further that this order is currently the best available sale price (i.e. the first, lowest price available in the current location). If a buy limit order is then placed for 101 ISK, then this order will match immediately at a price of 101 ISK (<em>not</em> 100 ISK) because the sell order has precedence. A similar behavior occurs when buy limit orders have precedence.</p>
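<p>The precedence rules for two limit orders can be sketched in a few lines of Python. This is only an illustration of the constraints listed above, not the game's actual matching engine: the <code>LimitOrder</code> record, its field names, and the <code>match_price</code> helper are hypothetical, and the sketch ignores location, range, quantity and minimum volume.</p>

```python
from dataclasses import dataclass


@dataclass
class LimitOrder:
    """Hypothetical minimal order record; the field names are illustrative."""
    is_buy: bool
    price: float
    issued: int  # arrival sequence number; a smaller value arrived earlier


def match_price(sell, buy):
    """Return the transaction price for a sell and a buy limit order.

    Returns None when no price satisfies both limits (the sell must trade
    at or above its price, the buy at or below its price).
    """
    if sell.price > buy.price:
        return None
    if sell.issued > buy.issued:
        # The buy order arrived first, so it has precedence and the trade
        # happens at the minimum acceptable price.
        return sell.price
    # The sell order arrived first (ties are treated the same way here, which
    # is an assumption), so the trade happens at the maximum acceptable price.
    return buy.price


# The example from the text: a resting sell at 100 ISK, then a buy at 101 ISK.
resting_sell = LimitOrder(is_buy=False, price=100.0, issued=1)
incoming_buy = LimitOrder(is_buy=True, price=101.0, issued=2)
print(match_price(resting_sell, incoming_buy))
```

<p>Under these assumptions the incoming buy order pays its own posted price, which is why a buy limit order priced above the best sell price gives away the difference.</p>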
<h3 id="transaction-costs">Transaction Costs</h3>
<p>EVE markets impose two fees on orders:</p>
<ul>
<li><em>Broker Fees</em> are charged at order placement time for all limit orders (even orders which match immediately). The owner of the order is charged this fee.</li>
<li><em>Sales Tax</em> is charged each time a sell order is matched and is paid by the seller.</li>
</ul>
<p>Broker fees are charged as a percentage of the total order value, with the charge normally between 2.5% and 5% (adjusted by standing and the Broker Relations skill). Sales tax is charged as a percentage of the matched order volume multiplied by the match price, with the charge normally between 1% and 2% (adjusted by the Accounting Skill). Sales tax in player-owned structures is determined by the owner and may be lower than the normal range.</p>
<p>There are two important consequences of this fee structure:</p>
<ol type="1">
<li>Market buy orders incur no fees. Thus, it can sometimes be advantageous to simply wait for a good price instead of placing a limit buy order (which will incur a broker fee); and,</li>
<li>The fees determine the minimum return required for a profitable strategy. For example, a typical market making strategy which uses limit orders on both the buy and sell side will need to return at least 6% to be profitable (a 2.5% broker fee on each of the two limit orders, 5% in total, plus 1% sales tax on the sale).</li>
</ol>
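<p>To make the fee arithmetic concrete, here is a minimal sketch of the round-trip cost of buying and reselling with limit orders. The default rates (2.5% broker fee, 1% sales tax) are assumptions taken from the low end of the ranges quoted above, the function names are illustrative, and the sketch assumes both orders fill completely.</p>

```python
def round_trip_profit(buy_price, sell_price, quantity,
                      broker_fee=0.025, sales_tax=0.01):
    """Net ISK profit from buying and then reselling with limit orders.

    Broker fees are charged on the value of both limit orders; sales tax is
    charged on the value of the sale. The default rates are assumptions.
    """
    buy_value = buy_price * quantity
    sell_value = sell_price * quantity
    fees = broker_fee * (buy_value + sell_value) + sales_tax * sell_value
    return sell_value - buy_value - fees


def break_even_return(broker_fee=0.025, sales_tax=0.01):
    """Smallest markup (sell price / buy price - 1) that yields zero profit.

    Derived from sell * (1 - broker_fee - sales_tax) = buy * (1 + broker_fee).
    """
    return (1 + broker_fee) / (1 - broker_fee - sales_tax) - 1


# Buy one unit at 100 ISK and resell it at 110 ISK.
print(round_trip_profit(100.0, 110.0, 1))
# Required markup just to cover fees at the assumed rates.
print(break_even_return())
```

<p>At these assumed rates the break-even markup comes out slightly above 6%, because the fees on the sell side are charged against the higher sale price.</p>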
<p>Training Broker Relations and Accounting skills is usually mandatory for serious EVE market players as these skills at max training reduce fees by a significant amount.</p>
<h3 id="information-disclosure">Information Disclosure</h3>
<p>The primary live source of information for EVE markets is the game client UI which shows a live view of the current order book for each asset, as well as historical pricing information. CCP provides third party developer APIs which expose similar information (see below), but the game client is always the most up to date view of the markets. Several important metrics are visible from the game client including the volume (total assets traded) to date, and the current spread (difference between best buy price and best sell price) between buyers and sellers. The order book for a given asset displays price, volume and location for each order, but does <em>not</em> display the names of market participants. However, the names of market participants <em>are</em> recorded and visible to each counter-party when a transaction is completed. This is a unique feature of EVE markets which is quite different from other financial markets where anonymity is preserved between counter-parties.<a href="#fn3" class="footnoteRef" id="fnref3"><sup>3</sup></a> This feature also provides an opportunity for traders to gather more information about other market participants, for example by buying or selling in small quantities to competing bids in order to discover the names of competitors.</p>
<p>Recent EVE expansions have added the ability to create market hubs in player-owned structures. These hubs are added to the market for the region they are located in, allowing orders to be placed from player-owned locations. Although market hubs are now included in regional markets, the player-owned structure itself may not be publicly accessible, or accessibility may change during the course of a day’s market events. As a result:</p>
<ul>
<li>Sell orders may appear from player-owned structures, but sold items may be inaccessible if a structure changes access after the transaction completes; and,</li>
<li>Buy orders may appear from non-public player-owned structures. The location of these structures is visible in game, but is currently <em>not</em> visible in data out of game (e.g. third party APIs).</li>
</ul>
<p>It is generally safe to sell to orders placed from player-owned structures. However, great care should be taken when buying from orders placed from player-owned structures. Missing location information in out of game data sources is mostly an annoyance right now as workarounds exist to obtain the needed information.</p>
<h3 id="contract-system">Contract System</h3>
<p>EVE has a large secondary market called the “contract system” which allows for direct player to player exchange of numerous assets, including assets which are not tradable on EVE markets (e.g. fitted ships). We don’t cover the contract system in this book, mainly because there is currently no reliable third party API to gather information about available contracts.</p>
<h2 id="market-data">Market Data</h2>
<p>As noted above, market data is visible in the game client in two separate views:</p>
<ol type="1">
<li>A view of the current order book for an asset in a region; and</li>
<li>A view of the daily high, low, and average trade prices (and related metrics), and total volume for an asset as far back as one year.</li>
</ol>
<p>The game client always shows the most recent view of market data. All other data sources (e.g. third party developer sources) lag the game client. It’s also worth noting that the game client is the only EULA-approved way to participate in the market by buying or selling assets.<a href="#fn4" class="footnoteRef" id="fnref4"><sup>4</sup></a></p>
<p>Data driven development of trading strategies is usually conducted offline, preferably using several years of market data. CCP has only recently invested effort in making market data easily accessible to third party developers. Historically, market data was gathered through crowd sourcing efforts by scraping cached data stored in the game client.<a href="#fn5" class="footnoteRef" id="fnref5"><sup>5</sup></a> Scraped data was then aggregated and made available by third party sites like <a href="https://eve-central.com/">EVE Central</a>. Asset coverage relied on having enough players interested in regularly viewing certain assets (so that the client cache would be refreshed regularly). The fact that players themselves were providing data also raised concerns about purposely modifying data to affect game play. Despite these limitations and risks, crowd sourced data was the primary source of market data for many years.</p>
<p>In 2015, CCP started providing market history, order book, and market pricing data through their <a href="https://eveonline-third-party-documentation.readthedocs.io/en/latest/crest/index.html">“CREST” API</a>. CREST is a REST-based web API which supports both public and authorized data endpoints. Authorized endpoints use the <a href="https://eveonline-third-party-documentation.readthedocs.io/en/latest/sso/index.html">EVE Single Sign-On</a> service based on OAuth access control. CREST authorization is only used for certain player or corporation related endpoints. Market data can be accessed on CREST without authentication. When market modules were released for player-owned structures (e.g. Citadels), public buy orders placed in these modules were eventually made visible in CREST market data. However, CREST provided no mechanism for resolving the location where these player-owned structures resided, making it difficult to implement the same order matching rules as provided in the game client.</p>
<p>While CREST was an important upgrade for third party developers (as compared to the <a href="https://eveonline-third-party-documentation.readthedocs.io/en/latest/xmlapi/index.html">XML API</a>), CCP significantly modernized third party APIs by releasing the <a href="https://esi.tech.ccp.is/latest/">EVE Swagger Interface (ESI)</a> which exposed a new REST-based web API documented with <a href="http://swagger.io/">Swagger</a>. Swagger documentation allows the use of many convenient tools for third party developers, such as API browsers and client generators in a variety of languages. Swagger also provides clean support for versioning and authorization, making it much easier for CCP to evolve the API over time. The ESI provides the same market data as in CREST with two important upgrades:</p>
<ol type="1">
<li>A facility for resolving the location of certain player-owned structures; and,</li>
<li>Access to order book data local to player-owned structures.</li>
</ol>
<p>The ESI uses the same OAuth based authorization scheme as CREST. At time of writing, CCP has declared ESI as the API of the future and has deprecated both CREST and the XML API. However, CCP has stated they will keep the XML API and CREST active at least until ESI reaches feature parity. Unless otherwise indicated, examples in this book which require direct access to live market data will use the ESI endpoints.</p>
<p>The remainder of this section describes the main market data ESI endpoints in more detail.</p>
<h3 id="market-history-endpoints">Market History Endpoints</h3>
<p>The market history endpoint returns a daily price summary for a given asset type in a given region. In financial modeling terms, this data is similar to daily stock market data. The market history endpoint returns all data from the start of the previous calendar year. Note that data for the current day is typically not available until approximately 0800 UTC the following day.</p>
<p>We’ll use market history to demonstrate the use of the ESI. ESI endpoints follow REST conventions and can always be accessed using the HTTP protocol. For example, the following URL will return market history for Tritanium (type 34) in Forge (region 10000002):</p>
<pre><code>https://esi.tech.ccp.is/latest/markets/10000002/history/?datasource=tranquility&type_id=34</code></pre>
<p>The result of this request will be a JSON formatted array of values of the form:</p>
<div class="sourceCode"><pre class="sourceCode json"><code class="sourceCode json"><span class="fu">{</span>
<span class="dt">"date"</span><span class="fu">:</span> <span class="st">"2015-05-01"</span><span class="fu">,</span>
<span class="dt">"average"</span><span class="fu">:</span> <span class="fl">5.25</span><span class="fu">,</span>
<span class="dt">"highest"</span><span class="fu">:</span> <span class="fl">5.27</span><span class="fu">,</span>
<span class="dt">"lowest"</span><span class="fu">:</span> <span class="fl">5.11</span><span class="fu">,</span>
<span class="dt">"order_count"</span><span class="fu">:</span> <span class="dv">2267</span><span class="fu">,</span>
<span class="dt">"volume"</span><span class="fu">:</span> <span class="dv">16276782035</span>
<span class="fu">}</span></code></pre></div>
<p>where:</p>
<ul>
<li><em>date</em> - gives the date for which the values are reported.</li>
<li><em>average</em> - gives the average trade price of the asset for the day.</li>
<li><em>highest</em> - gives the highest trade price of the asset for the day.</li>
<li><em>lowest</em> - gives the lowest trade price of the asset for the day.</li>
<li><em>order_count</em> - gives the number of trades for the day.</li>
<li><em>volume</em> - gives the number of units traded for the day.</li>
</ul>
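<p>As a minimal sketch of working with this endpoint, the Python fragment below builds the request URL shown above and summarizes a decoded response. The helper names (<code>history_url</code>, <code>summarize</code>) are our own, the sample body simply reuses the row shown above, and no network call is made; to fetch live data you would pass the URL to an HTTP client of your choice.</p>

```python
import json
from urllib.parse import urlencode

ESI_BASE = "https://esi.tech.ccp.is/latest"  # base URL used in this chapter


def history_url(region_id, type_id, datasource="tranquility"):
    """Build the market history URL for one type in one region."""
    query = urlencode({"datasource": datasource, "type_id": type_id})
    return "{}/markets/{}/history/?{}".format(ESI_BASE, region_id, query)


def summarize(rows):
    """Return (total volume, volume-weighted average price) over daily rows."""
    total_volume = sum(row["volume"] for row in rows)
    weighted = sum(row["average"] * row["volume"] for row in rows)
    return total_volume, weighted / total_volume


# Sample response body: an array holding the single row shown above.
sample_body = json.dumps([{
    "date": "2015-05-01", "average": 5.25, "highest": 5.27,
    "lowest": 5.11, "order_count": 2267, "volume": 16276782035,
}])

url = history_url(10000002, 34)
total_volume, vwap = summarize(json.loads(sample_body))
print(url)
print(total_volume, vwap)
```

<p>With a real response the array holds one row per day, so the same <code>summarize</code> call would give the volume-weighted average over the whole period returned.</p>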
<p>What makes the ESI special is the Swagger definition which can be retrieved at:</p>
<pre><code>https://esi.tech.ccp.is/latest/swagger.json?datasource=tranquility</code></pre>
<p>The Swagger definition is a structured description of the endpoints of ESI, including the name and type of the arguments required, the format of the result, and any authorization that may be required. Tools make use of this definition to automatically generate documentation and API clients. For example, the documentation of the above endpoint can be found <a href="https://esi.tech.ccp.is/latest/#!/Market/get_markets_region_id_history">here</a>.</p>
<p>In addition to the response body, the ESI also returns several important HTTP response headers. There are three headers you should observe:</p>
<ul>
<li><em>date</em> - this is the date at the server when the result was generated. All dates are normally UTC (i.e. “EVE” time).</li>
<li><em>last-modified</em> - this is the date the response body was last changed. The difference between <em>date</em> and <em>last-modified</em> gives the age of the data.</li>
<li><em>expires</em> - this is the date when the current response body should be refreshed. It will be unproductive to request the data again before this time.</li>
</ul>
<p>The <em>expires</em> field is important for automated collection of data, and for competitive analysis. The tools we describe later in this chapter use <em>expires</em> to drive regular downloads of the data for historical analysis. The <em>expires</em> field also tells you how frequently other market participants can see fresh data (unless they are using the game client).</p>
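<p>The header arithmetic described above can be sketched with the Python standard library. The header values below are hypothetical, and <code>cache_timing</code> is our own helper name; we assume the headers arrive as a dictionary with lower-case keys holding standard HTTP date strings.</p>

```python
from email.utils import parsedate_to_datetime


def cache_timing(headers, now=None):
    """Return (age of data, seconds until refresh) from ESI response headers."""
    served = parsedate_to_datetime(headers["date"])
    modified = parsedate_to_datetime(headers["last-modified"])
    expires = parsedate_to_datetime(headers["expires"])
    now = now or served  # default: measure from the server's own clock
    age = (served - modified).total_seconds()
    wait = max(0.0, (expires - now).total_seconds())
    return age, wait


# Hypothetical header values for illustration only.
sample_headers = {
    "date": "Sun, 15 Jan 2017 10:02:00 GMT",
    "last-modified": "Sun, 15 Jan 2017 10:00:00 GMT",
    "expires": "Sun, 15 Jan 2017 10:05:00 GMT",
}

age, wait = cache_timing(sample_headers)
print(age, wait)  # 120.0 180.0
```

<p>A collector built along these lines would sleep for <code>wait</code> seconds before requesting the same resource again.</p>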
<h3 id="order-book-data-endpoints">Order Book Data Endpoints</h3>
<p>The market order book endpoints return a view of the current orders for a given region, optionally restricted to a given type. The ESI endpoint to retrieve this data is <a href="https://esi.tech.ccp.is/latest/#!/Market/get_markets_region_id_orders">“get market region orders”</a>. The following URL will retrieve the first page of orders for Tritanium (type 34) in The Forge (region 10000002):</p>
<pre><code>https://esi.tech.ccp.is/latest/markets/10000002/orders/?datasource=tranquility&order_type=all&page=1&type_id=34</code></pre>
<p>Leaving off <code>type_id=34</code> will return all orders for all types in the given region. The result of this request will be a JSON formatted array of values of the form:</p>
<div class="sourceCode"><pre class="sourceCode json"><code class="sourceCode json"><span class="fu">{</span>
<span class="dt">"order_id"</span><span class="fu">:</span> <span class="dv">4740968511</span><span class="fu">,</span>
<span class="dt">"type_id"</span><span class="fu">:</span> <span class="dv">34</span><span class="fu">,</span>
<span class="dt">"location_id"</span><span class="fu">:</span> <span class="dv">60005599</span><span class="fu">,</span>
<span class="dt">"volume_total"</span><span class="fu">:</span> <span class="dv">1296000</span><span class="fu">,</span>
<span class="dt">"volume_remain"</span><span class="fu">:</span> <span class="dv">952089</span><span class="fu">,</span>
<span class="dt">"min_volume"</span><span class="fu">:</span> <span class="dv">1</span><span class="fu">,</span>
<span class="dt">"price"</span><span class="fu">:</span> <span class="dv">10</span><span class="fu">,</span>
<span class="dt">"is_buy_order"</span><span class="fu">:</span> <span class="kw">false</span><span class="fu">,</span>
<span class="dt">"duration"</span><span class="fu">:</span> <span class="dv">90</span><span class="fu">,</span>
<span class="dt">"issued"</span><span class="fu">:</span> <span class="st">"2017-01-06T22:29:36Z"</span><span class="fu">,</span>
<span class="dt">"range"</span><span class="fu">:</span> <span class="st">"region"</span>
<span class="fu">}</span></code></pre></div>
<p>where:</p>
<ul>
<li><em>order_id</em> - gives the unique order id for this order.</li>
<li><em>type_id</em> - gives the type of asset transacted by this order.</li>
<li><em>location_id</em> - gives the location where this order was placed. The location is important for computing possible order matches and is discussed in more detail below.</li>
<li><em>volume_total</em> - gives the total number of assets to be transacted when the order was placed.</li>
<li><em>volume_remain</em> - gives the number of assets still left to be transacted for this order.</li>
<li><em>min_volume</em> - gives the minimum number of assets which must be exchanged each time this order is matched. Note that “volume_remain” can be less than “min_volume”, in which case fewer assets may transact on the last match for the order.</li>
<li><em>price</em> - is the posted price for the order.</li>
<li><em>is_buy_order</em> - is true if the order is a “bid” (buy), otherwise the order is an “ask” (sell).</li>
<li><em>duration</em> - is the maximum length of time in days that the order will remain in the book, unless it is filled or canceled first.</li>
<li><em>issued</em> - is the issue date of the order.</li>
<li><em>range</em> - only applies to bids and is the maximum distance from the origin station that an ask will be allowed to match.</li>
</ul>
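<p>As a small illustration of these fields, the sketch below finds the best bid and best ask in a list of decoded orders. The sample orders are hypothetical and carry only the fields used here. Note that this deliberately ignores <em>location_id</em> and <em>range</em>, so the two prices it reports are not guaranteed to be able to match each other.</p>

```python
def best_bid_ask(orders):
    """Return (highest bid price, lowest ask price); None where a side is empty."""
    bids = [o["price"] for o in orders if o["is_buy_order"]]
    asks = [o["price"] for o in orders if not o["is_buy_order"]]
    return (max(bids) if bids else None, min(asks) if asks else None)


# Hypothetical orders; only the fields used here are filled in.
sample_orders = [
    {"order_id": 1, "price": 9.50, "is_buy_order": True},
    {"order_id": 2, "price": 9.75, "is_buy_order": True},
    {"order_id": 3, "price": 10.00, "is_buy_order": False},
    {"order_id": 4, "price": 10.25, "is_buy_order": False},
]

bid, ask = best_bid_ask(sample_orders)
print(bid, ask, ask - bid)  # 9.75 10.0 0.25
```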
<p>Order book data is the most current out-of-game view of the markets and is therefore very important to many traders. Some important properties of order book data:</p>
<ul>
<li>At time of writing, order book data is refreshed every five minutes. That is, your view of market data may be up to five minutes old. You can refer to the <em>expires</em> response header to determine when the order book will be refreshed.</li>
<li>Order book data for most active regions cannot be returned in a single API call (unless filtering for a single type). Instead, book data is “paged” across multiple calls. The requested page is controlled by the “page” argument as shown in the URL above. The ESI does not report how many total pages are available. The normal solution is to continue to retrieve pages until a page is returned containing fewer than 10000 orders. This indicates the last page available for the query. The “page” argument is ignored for requests filtered by type.</li>
<li>Order book data may include orders from player-owned structures, some of which may be non-public. Orders from non-public structures cause problems for out-of-game analysis because the ESI currently provides no way to discover the location of these structures. Crowd-sourced data can be used in these cases (see <a href="#example-3---trading-rules-build-an-order-matching-algorithm">Example 3 - Trading Rules: Build an Order Matching Algorithm</a> below).</li>
</ul>
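<p>The paging rule in the list above can be sketched as a simple loop. To keep the example runnable offline, the page retrieval is passed in as a function: <code>fetch_page</code> is a hypothetical callable that takes a page number and returns the decoded array of orders, and <code>fake_fetch</code> is a stand-in that simulates a region with 25000 orders.</p>

```python
PAGE_SIZE = 10000  # a page with fewer orders than this is treated as the last


def fetch_all_orders(fetch_page):
    """Collect orders from successive pages, starting at page 1."""
    orders = []
    page = 1
    while True:
        batch = fetch_page(page)
        orders.extend(batch)
        if len(batch) < PAGE_SIZE:
            break  # short (or empty) page: nothing further to retrieve
        page += 1
    return orders


def fake_fetch(page, total=25000):
    """Stand-in for an HTTP call: serve `total` placeholder orders in pages."""
    start = (page - 1) * PAGE_SIZE
    stop = min(start + PAGE_SIZE, total)
    return [{"order_id": i} for i in range(start, stop)]


all_orders = fetch_all_orders(fake_fetch)
print(len(all_orders))  # 25000
```

<p>When the total happens to be an exact multiple of the page size, the loop makes one extra request and stops on the empty page it receives.</p>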
<blockquote>
<h3 id="order-book-data-gaps">Order Book Data Gaps</h3>
<p>At various points in time, CCP’s order book endpoints have had gaps in the data they provide. These gaps take the form of orders missing from snapshots. A common pattern is to see an order appear in one snapshot, then disappear, then appear again in a later snapshot. An order should exit the order book only if it is canceled or completely filled. Moreover, an order which leaves the order book should never reappear, because order IDs are unique. We’ll see this problem, and a work-around, in <a href="#example-4---unpublished-data-build-a-trade-heuristic">Example 4</a> below. At time of writing, this problem was still occurring, even in the new ESI based endpoints.</p>
</blockquote>
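<p>One way to spot the gap pattern described above is to track which order IDs vanish from a snapshot and later return. The sketch below assumes each snapshot has already been reduced to a set of order IDs; the function name and sample IDs are ours, and this is only a detector, not the work-around developed in Example 4.</p>

```python
def reappearing_orders(snapshots):
    """Return IDs of orders that vanish from a snapshot and later return.

    snapshots is a time-ordered list of sets of order IDs.
    """
    seen = set()     # IDs present in any snapshot so far
    missing = set()  # IDs seen earlier but absent from the previous snapshot
    flagged = set()
    for current in snapshots:
        flagged |= missing.intersection(current)  # came back after a gap
        seen |= current
        missing = seen - current
    return flagged


# Order 2 drops out of the second snapshot and returns in the third;
# order 4 leaves for good, which is normal behavior.
snapshots = [{1, 2, 3, 4}, {1, 3}, {1, 2, 3}]
print(reappearing_orders(snapshots))  # {2}
```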
<p>Order book data can also be requested directly from player-owned structures. This is done using the <a href="https://esi.tech.ccp.is/latest/#!/Market/get_markets_structures_structure_id">“get markets structures”</a> endpoint. Some player-owned markets are not public, despite their buy orders appearing in the regional market, but for those that allow access, the format of the results is identical to the format returned by the <a href="https://esi.tech.ccp.is/latest/#!/Market/get_markets_region_id_orders">“get market region orders”</a> endpoint.</p>
<h3 id="pricing-data-endpoints">Pricing Data Endpoints</h3>
<p>Certain industrial calculations, such as reprocessing tax, require reference price data computed by CCP on a daily basis. This data is available to third party developers using the <a href="https://esi.tech.ccp.is/latest/#!/Market/get_markets_prices">“get market prices”</a> endpoint. A request to this endpoint will return an array of price data in this format:</p>
<div class="sourceCode"><pre class="sourceCode json"><code class="sourceCode json"><span class="fu">{</span>
<span class="dt">"type_id"</span><span class="fu">:</span> <span class="dv">32772</span><span class="fu">,</span>
<span class="dt">"average_price"</span><span class="fu">:</span> <span class="fl">501374.49</span><span class="fu">,</span>
<span class="dt">"adjusted_price"</span><span class="fu">:</span> <span class="fl">502330.89</span>
<span class="fu">}</span></code></pre></div>
<p>where:</p>
<ul>
<li><em>type_id</em> - gives the asset type.</li>
<li><em>average_price</em> - gives a rolling average price<a href="#fn6" class="footnoteRef" id="fnref6"><sup>6</sup></a>.</li>
<li><em>adjusted_price</em> - gives a formulaic price which is used as a reference price in several calculations. The formula by which this price is computed is not publicly documented.</li>
</ul>
<p>We document this endpoint because some strategies discussed in later chapters need to compute certain industrial formulas.</p>
<h3 id="structure-location-endpoints">Structure Location Endpoints</h3>
<p>The last market data endpoint we document is the <a href="https://esi.tech.ccp.is/latest/#!/Universe/get_universe_structures_structure_id">“get universe structure id”</a> endpoint. This is an authenticated endpoint which provides player-owned structure information in the format:</p>
<div class="sourceCode"><pre class="sourceCode json"><code class="sourceCode json"><span class="fu">{</span>
<span class="dt">"name"</span><span class="fu">:</span> <span class="st">"V-3YG7 VI - The Capital"</span><span class="fu">,</span>
<span class="dt">"solar_system_id"</span><span class="fu">:</span> <span class="dv">30000142</span>
<span class="fu">}</span></code></pre></div>
<p>where:</p>
<ul>
<li><em>name</em> - gives the name of the player-owned structure.</li>
<li><em>solar_system_id</em> - gives the solar system where the structure resides.</li>
</ul>
<p>The use for this endpoint is not obvious until one needs to calculate which orders will match in a given region. As described above, the buy orders which match a given sell order are determined by the location of the sell order, and the range and location of each buy order. Therefore, the location of player-owned structures must be known in order to determine whether buy orders submitted at those structures can potentially match. At time of writing, the structure location endpoint is the only third party API which provides access to the location of public player-owned structures. However, as we discussed in <a href="#order-book-data-endpoints">Order Book Data Endpoints</a>, the order book for a region may also display buy orders from non-public player-owned structures. The structure location endpoint can not be used to determine the location of these structures unless the (authenticated) caller is authorized to view this data (for example, the caller is a member of the corporation which owns the player-owned structure). Fortunately, there is at least one third party data source that attempts to document the location of non-public structures. We show an example of using the structure location endpoint, as well as other third party data sources, when we construct an order matching algorithm in <a href="#example-3---trading-rules-build-an-order-matching-algorithm">Example 3</a> below.</p>
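<p>To make the role of location concrete, here is a deliberately simplified range check. It rests on several assumptions that are ours rather than anything stated above: that a bid’s <em>range</em> is one of the strings <code>"station"</code>, <code>"solarsystem"</code>, <code>"region"</code> or a number of jumps, that both orders are in the same region, and that the caller already knows the jump count between the two solar systems (which requires map data). The full matching rules are developed in Example 3.</p>

```python
def bid_in_range(bid_range, same_station, jumps):
    """Decide whether a sell at some location falls within a bid's range.

    bid_range    -- the bid's "range" field (assumed values, see text)
    same_station -- True if the seller is at the bid's own location
    jumps        -- jumps between the two solar systems (0 = same system)
    """
    if bid_range == "region":
        return True
    if bid_range == "station":
        return same_station
    if bid_range == "solarsystem":
        return jumps == 0
    return jumps <= int(bid_range)  # numeric range: maximum jump count


print(bid_in_range("region", False, 12))      # True
print(bid_in_range("station", False, 0))      # False
print(bid_in_range("solarsystem", False, 0))  # True
print(bid_in_range("5", False, 6))            # False
```

<p>The check shows why structure locations matter: without the solar system of a player-owned structure, neither <code>same_station</code> nor <code>jumps</code> can be determined for orders placed there.</p>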
<h3 id="discovering-trade-data">Discovering Trade Data</h3>
<p>As noted above, CCP currently does not provide an API endpoint for retrieving individual trades. This lack of data is limiting in some cases, but fortunately a portion of trades can be inferred by observing changes in order book data. This approach is effective for trades that do not completely consume a standing limit order. However, limit orders which are completely filled cannot be distinguished from canceled orders, because both simply disappear from the order book. Thus, the best we can do is rely on heuristics to estimate the trades we can’t otherwise infer. Because CCP publishes daily trade volume, we do have some measure of how closely our heuristics match reality. We’ll derive a simple trade estimation heuristic in <a href="#example-4---unpublished-data-build-a-trade-heuristic">Example 4</a> below.</p>
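<p>The inference described above amounts to comparing two consecutive snapshots. The sketch below assumes each snapshot has been reduced to a mapping from <em>order_id</em> to <em>volume_remain</em>; the function name and the sample volumes are ours. It reports the partial fills it can see and separately lists the orders that vanished, which are the cases a heuristic must resolve.</p>

```python
def infer_trades(before, after):
    """Infer partial fills between two snapshots of the same order book.

    before and after map order_id to the order's volume_remain. Returns
    (inferred, ambiguous): inferred maps order_id to units traded, and
    ambiguous lists orders that vanished (filled or canceled, unknown).
    """
    inferred = {}
    ambiguous = []
    for order_id, old_volume in before.items():
        if order_id not in after:
            ambiguous.append(order_id)
        elif after[order_id] < old_volume:
            inferred[order_id] = old_volume - after[order_id]
    return inferred, sorted(ambiguous)


# Hypothetical snapshots: 101 partially fills, 102 is untouched,
# 103 disappears, and 104 is a newly placed order.
before = {101: 1000, 102: 500, 103: 250}
after = {101: 400, 102: 500, 104: 900}

inferred, ambiguous = infer_trades(before, after)
print(inferred, ambiguous)  # {101: 600} [103]
```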
<h2 id="tools-used-in-this-book">Tools used in this Book</h2>
<p>This section provides an introduction to the tools we use to develop strategies in the remainder of the book. As you may have surmised, access to historic and live market data is critically important for analysis, back testing, live testing and execution of trading strategies. Many third party sites provide this data in various formats. At Orbital Enterprises, we’ve created our own market data collection tools which we describe below. Our tools, as well as most modern tools (including the EVE Swagger Interface), use web interfaces annotated with <a href="http://swagger.io/">Swagger</a>. We therefore provide a brief introduction to Swagger along with a few tips for working with Swagger-annotated endpoints. The <a href="https://developers.eveonline.com/resource/resources">EVE Static Data Export (SDE)</a> is another critical resource for third party developers and is needed for some of the strategies we describe in this book. The SDE is provided as a raw data export which most players acquire themselves. At Orbital Enterprises, we’ve created an online tool for accessing the SDE which we use in our examples. We describe this tool below. Finally, we briefly introduce <a href="http://jupyter.org/">Jupyter</a> which has quickly become the de facto data science tool in Python. Most of the examples we provide in the book are shared as Jupyter notebooks on our <a href="https://github.com/OrbitalEnterprises/eve-market-strategies">GitHub site</a>.</p>
<h3 id="market-data-tools">Market Data Tools</h3>
<p>Orbital Enterprises hosts a <a href="https://evekit.orbital.enterprises//#/md/main">market collection service</a> which provides historic and live access to book data and daily market snapshots (the “Order Book Data” and “Market History” endpoints described above, respectively). The service exposes a Swagger annotated API which can be accessed <a href="https://evekit.orbital.enterprises//#/md/ui">interactively</a>. Historic data is uploaded nightly to <a href="https://storage.googleapis.com/evekit_md">Google Storage</a> organized by date. Although the entire history maintained by the site is accessible through the service API, for research and back testing purposes it is usually more convenient to download the data in bulk from the Google Storage site.</p>
<blockquote>
<h3 id="about-swagger">About Swagger</h3>
<p><a href="http://swagger.io/">Swagger</a> is a configuration language for describing REST-based web services. A Swagger-annotated web site will normally provide a <code>swagger.json</code> file which defines the services provided by the site. For example, CCP’s EVE Swagger Interface provides <a href="https://esi.tech.ccp.is/latest/swagger.json?datasource=tranquility">this <code>swagger.json</code> file</a>.</p>
<p>The power of Swagger is that the <code>swagger.json</code> file can be provided as input to tools which automate the generation of documentation and client code. For example, the <a href="http://petstore.swagger.io/">Swagger UI</a> will generate an interactive UI for any valid Swagger specification. The <a href="http://editor.swagger.io/#/">Swagger Editor</a> has similar capabilities but will also generate clients (and servers) in a variety of programming languages. In most cases, you won’t ever need to manually inspect a Swagger configuration file (much less learn the configuration language) as the tooling does all the hard work for you.</p>
<p>In this book, we introduce many APIs using the Swagger UI. You can follow along by browsing to the generic <a href="http://petstore.swagger.io/">Swagger UI</a> and inserting the URL for the appropriate <code>swagger.json</code> configuration file. Most of our code samples are in Python for which we use the <a href="https://github.com/Yelp/bravado">Bravado</a> Swagger client generator. We’ll describe Bravado in more detail below.</p>
<p><strong>NOTE:</strong> the generic Swagger UI will <em>not</em> work with authorized endpoints of the ESI. This is because of the way single sign-on is implemented with the ESI servers. Using authorized endpoints from batch (i.e. non-interactive) code is likewise challenging. One work around is to use a proxy like the <a href="https://github.com/OrbitalEnterprises/orbital-esi-proxy">ESI Proxy</a> which we use at Orbital Enterprises. This proxy handles OAuth authorization flows automatically, exposing a much simpler interface to our market strategy code.</p>
</blockquote>
<p>Let’s use the Swagger UI to introduce the Orbital Enterprises market collection service. You can follow along by browsing to the <a href="https://evekit.orbital.enterprises//#/md/ui">interactive UI</a>. The UI lists three endpoints:</p>
<figure>
<img src="img/mcs_view_1.PNG" alt="EveKit MarketData Server" /><figcaption>EveKit MarketData Server</figcaption>
</figure>
<p>These endpoints provide the following functions:</p>
<ul>
<li><em>history</em> - retrieves market history by type, region and date.</li>
<li><em>book</em> - retrieves the order book snapshot by type and region closest to a specified time stamp.</li>
<li><em>livebook</em> - a specialized version of the <em>book</em> endpoint which always retrieves the latest order book snapshot for a region and list of type IDs.</li>
</ul>
<p>Each endpoint can be expanded to view documentation and call the service. For example, expanding the <em>history</em> endpoint reveals:</p>
<figure>
<img src="img/mcs_view_2.PNG" alt="Expanded History Endpoint" /><figcaption>Expanded History Endpoint</figcaption>
</figure>
<p>Filling in the <em>typeID</em>, <em>regionID</em> and <em>date</em> fields with “34”, “10000002” and “2017-01-15” returns the following result (Click “Try it out!”):</p>
<figure>
<img src="img/mcs_view_3.PNG" alt="Market history for Tritanium (type 34) in The Forge (region 10000002) on 2017-01-15" /><figcaption>Market history for Tritanium (type 34) in The Forge (region 10000002) on 2017-01-15</figcaption>
</figure>
<p>The fields in the result match the “Market History” ESI endpoint with the following additions:</p>
<ul>
<li><em>typeID</em> - the type ID of the result.</li>
<li><em>regionID</em> - the region ID of the result.</li>
<li><em>date</em> - the retrieval date timestamp in milliseconds UTC since the epoch (1970-01-01 00:00:00).</li>
</ul>
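<p>Timestamps expressed in milliseconds since the epoch are easy to convert in Python. The helper below is a minimal sketch (the name <code>from_millis</code> is ours) that returns a timezone-aware UTC value so that dates are not silently shifted to local time.</p>

```python
from datetime import datetime, timezone


def from_millis(millis):
    """Convert milliseconds since the epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


# 1484438400000 ms corresponds to midnight UTC on 2017-01-15.
stamp = from_millis(1484438400000)
print(stamp.isoformat())  # 2017-01-15T00:00:00+00:00
```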
<p>The <em>book</em> endpoint has a similar interface but since order book snapshots have five minute resolution (based on the cache timer for the endpoint), you can provide a full timestamp. The endpoint will return the latest book snapshot before the specified date/time. Here is the result of the same query above at 10:00 (UTC):</p>
<figure>
<img src="img/mcs_view_4.PNG" alt="Book snapshot for Tritanium (type 34) in the Forge (region 10000002) at 2017-01-15T10:00:00" /><figcaption>Book snapshot for Tritanium (type 34) in the Forge (region 10000002) at 2017-01-15T10:00:00</figcaption>
</figure>
<p>The result is a JSON object where the <em>bookTime</em> field records the snapshot time in milliseconds UTC since the epoch. The <em>orders</em> field lists the buy and sell orders in the order book snapshot. The fields in the order results match the “Order Book Data” ESI endpoint with some slight modifications (e.g. timestamps are converted to milliseconds UTC since the epoch) and the addition of <em>typeID</em> and <em>regionID</em> fields.</p>
<p>The <em>livebook</em> endpoint is identical to the <em>book</em> endpoint with two main differences:</p>
<ol type="1">
<li>You may specify multiple type IDs (up to 100 at time of writing). The result will contain order books for all the requested types.</li>
<li>The result always represents the latest live data. That is, there is no <em>date</em> argument to this endpoint.</li>
</ol>
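<p>Because of the per-request limit on type IDs, a strategy that follows many types needs to split its list into batches. A minimal sketch, assuming the limit of 100 quoted above and using placeholder IDs rather than a real watch list:</p>

```python
MAX_TYPES = 100  # per-request limit quoted above, at time of writing


def batches(type_ids, size=MAX_TYPES):
    """Split a list of type IDs into consecutive batches of at most `size`."""
    return [type_ids[i:i + size] for i in range(0, len(type_ids), size)]


type_ids = list(range(34, 284))  # 250 placeholder IDs, not all real types
groups = batches(type_ids)
print([len(g) for g in groups])  # [100, 100, 50]
```

<p>Each batch would then be sent as one <em>livebook</em> request and the results merged by type ID.</p>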
<p>The <em>livebook</em> endpoint is most useful for live testing or live execution of trading strategies. We use this endpoint in the later chapters for specific strategies.</p>
<p>As we noted above, the endpoints of the market collection service are most useful for casual testing or for retrieving live data for running strategies. For back testing, it is usually more convenient to download historic data in bulk to local storage. The format of historic data is described on the <a href="https://evekit.orbital.enterprises//#/md/main">market collection service site</a>. We introduce Python code to read historic data below, either directly from Google Storage, or from local storage.</p>
<h3 id="static-data-export-sde">Static Data Export (SDE)</h3>
<p>The EVE Static Data Export is a regularly released static dump of in-game reference material. We’ve already seen data provided by the SDE in the last section - the numeric values for Tritanium (type ID 34) and The Forge (region ID 10000002) were provided by the SDE. The SDE is released by CCP at the <a href="https://developers.eveonline.com/resource/resources">Developer Resources Site</a>. The modern version of the SDE consists of <a href="http://www.yaml.org/">YAML</a> formatted files. However, most players find it more convenient to access the SDE from a relational database. Steve Ronuken provides conversions of the raw SDE export to various database formats at his <a href="https://www.fuzzwork.co.uk/dump/">Fuzzwork site</a>.</p>
<p>At Orbital Enterprises, we expose the latest two releases of the SDE as an <a href="https://evekit.orbital.enterprises//#/sde/main">online service</a>. The underlying data is just Steve Ronuken’s MySQL conversion accessed through a Swagger-annotated web service we provide. If you don’t want to download the SDE yourself, you may want to use our online service instead. Most of the examples we present in this book use the Orbital Enterprises service.</p>
<p>Because our service is Swagger-annotated, there is a ready made <a href="https://evekit.orbital.enterprises//#/sde/ui">interactive service</a> you can use to access the SDE:</p>
<figure>
<img src="img/sde_view_1.PNG" alt="Orbital Enterprises Online SDE UI" /><figcaption>Orbital Enterprises Online SDE UI</figcaption>
</figure>
<p>The “Release” drop down at the top can be used to select which SDE release to query against (the default is the latest release). At time of writing, we always maintain the two latest releases. We may consider maintaining more releases in the future. Queries against the online service use JSON expressions which are explained on the <a href="https://evekit.orbital.enterprises//#/sde/main">main site</a>. As an example, let’s look at a query to determine the type ID for Tritanium. First expand the “Inventory” section, then select “/inv/type”:</p>
<figure>
<img src="img/sde_view_2.PNG" alt="Partial view of the Inventory - Type endpoint" /><figcaption>Partial view of the Inventory - Type endpoint</figcaption>
</figure>
<p>We’ll search by partial name. Scroll down until the “typeName” field is visible and replace the default query value with <code>{ like: '%trit%' }</code>. Then click “Try it out!” (or just hit enter). You’ll see a result similar to the following:</p>
<figure>
<img src="img/sde_view_3.PNG" alt="SDE query result" /><figcaption>SDE query result</figcaption>
</figure>
<p>The result includes all types with names that contain the string “trit” (case insensitive). There are many such types, but the first result in this example happens to be the type we were searching for. Most of the market strategies we describe in this book rely on data in the “Inventory” and “Map” sections of the SDE. You can find reasonably recent documentation on the data stored in the SDE at the crowd-sourced <a href="http://eveonline-third-party-documentation.readthedocs.io/en/latest/index.html">Third Party Developer</a> documentation site. We provide more explicit documentation in the sections below where we use the SDE.</p>
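<p>If the <code>like</code> operator is new to you, the sketch below emulates its matching behavior locally. We assume the usual SQL conventions (<code>%</code> matches any run of characters, <code>_</code> matches exactly one) together with the case-insensitive matching seen in the query result; the function is our own illustration and is not part of the online service. The names in the list are sample values only.</p>

```python
import re


def like_match(pattern, text):
    """Emulate a case-insensitive SQL LIKE: % is any run, _ is one character."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    regex = "".join(parts)
    return re.fullmatch(regex, text, re.IGNORECASE | re.DOTALL) is not None


names = ["Tritanium", "Alloyed Tritanium Bar", "Pyerite"]
matches = [n for n in names if like_match("%trit%", n)]
print(matches)  # ['Tritanium', 'Alloyed Tritanium Bar']
```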
<h3 id="jupyter-notebook">Jupyter Notebook</h3>
<p>Our method for developing trading strategies could loosely be called “data science”, meaning we use scientific methods and tools for extracting knowledge or insight from raw data. Our main tool is the Python programming language around which a rich set of libraries and methodologies has been developed to support data science. Strategy development is an iterative process, and during the early stages of development it is useful to have tools which are interactive in nature. The <a href="http://jupyter.org/">Jupyter Project</a> and its predecessor <a href="https://ipython.org/">IPython</a> are arguably the most popular interactive tools for data science with Python<a href="#fn7" class="footnoteRef" id="fnref7"><sup>7</sup></a>. When combined with <a href="http://www.numpy.org/">NumPy</a> and <a href="http://pandas.pydata.org/">Pandas</a>, the Jupyter platform provides a very capable interactive data science environment. We use this environment almost exclusively in the examples we describe in this book. It is not mandatory, but you’ll get much more out of this book if you install Jupyter yourself and try out some of our examples.</p>
<p>The easiest way to get started with Jupyter is to install <a href="https://www.continuum.io/downloads">Anaconda</a> which is available for Windows, Mac and Linux. Anaconda is a convenient packaging of several open source data science tools for Python (also R and Scala), and also includes Jupyter. Once you’ve installed Anaconda, you can get started with Jupyter by following the <a href="https://jupyter.readthedocs.io/en/latest/running.html">quickstart</a> instructions (essentially just <code>jupyter notebook</code> in a shell and you’re ready). If you’re reasonably familiar with Python you can crash ahead and click “New -> Python 3” from your local Jupyter instance to create your first notebook. If you’d like a more comprehensive introduction, we like <a href="https://www.datacamp.com/community/tutorials/tutorial-jupyter-notebook">this tutorial</a>.</p>
<blockquote>
<h3 id="python-2-or-python-3">Python 2 or Python 3?</h3>
<p>If you’re familiar with Python, you’ll know that Python 3 is the recommended environment for new code but, unfortunately, Python 3 broke compatibility with Python 2 in many areas. Moreover, the large quantity of code still written in Python 2 (at time of writing) often leaves developers with a difficult decision as to which environment to use. Fortunately, all of the data science libraries we need for this book have been ported to Python 3. So we’ve made the decision to use Python 3 exclusively in this book. If you <em>must</em> use Python 2, you’ll find that most of our examples can be back-ported without difficulty. However, if you don’t have a strong reason to use Python 2, we recommend you stick with Python 3.</p>
</blockquote>
<p>The main interface to Jupyter is the notebook, which is a language <em>kernel</em> combined with code, text, graphs, etc. A language kernel is a back end capable of executing code in a given language. All of the examples we present in this section use the Python 3 language kernel, but Jupyter allows you to install other kernels as well (e.g. Python 2, R, Scala, Java, etc.). The code within a notebook is executed by the kernel with output displayed in the notebook. Text, graphic and other non-code artifacts are handled by the Jupyter environment itself and provide a way to document your work, or develop instructional material (as we’re doing in this book). Notebooks naturally keep a history of your actions (numbered code or text sections) and allow you to edit and re-run previous bits of code. That is, to iterate on your experiments. Finally, notebooks automatically checkpoint (i.e. regularly save your progress) and can be saved and restored at a later time. Make sure you run your notebook on a reasonably powerful machine, however, as deeper data analysis will use up significant memory.</p>
<p>The environment installed from Anaconda has most of the code we’ll need, but from time to time you may need to install other code. In this book, we do this in two cases:</p>
<ol type="1">
<li><p>We’ll install a few libraries that typically aren’t included in Anaconda. In fact, we’ll do this almost immediately in the first example below so that we can use the <a href="https://github.com/Yelp/bravado">Bravado</a> library for interacting with Swagger-annotated web services.</p></li>
<li><p>As we work through the examples, we’ll begin to develop a set of useful libraries we’ll want to use in later examples. We <em>could</em> copy this code to each of our notebooks but that would start to clutter our examples. Instead, we’ll show you how to install these libraries in the default path used by the Jupyter kernel.</p></li>
</ol>
<p>We’ll provide instructions for installing missing libraries in the examples where they are needed. Including your own code into Jupyter is a matter of ensuring the path to your packages is included in the “python path” used by Jupyter. If you’re already familiar with Python, you’re free to choose your favorite way of adding your local path. If you’re less familiar with Python, we suggest adding your packages to your local <code>.ipython</code> folder which is created the first time you start a Python kernel in Jupyter. This directory will be created in the home directory of the user which started the kernel.</p>
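<p>If you prefer to manage the path explicitly, the following sketch shows one way to do it from the first cell of a notebook. The helper name is ours, and the <code>~/.ipython</code> location is simply the folder suggested above; any directory containing your packages would work the same way.</p>

```python
import os
import sys


def add_to_python_path(directory):
    """Put `directory` at the front of sys.path unless it is already there."""
    directory = os.path.abspath(os.path.expanduser(directory))
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# "~/.ipython" is the folder suggested above; any directory works.
package_dir = add_to_python_path("~/.ipython")
print(package_dir in sys.path)  # True
```

<p>Packages placed in that directory can then be imported by name from any cell that runs after this one.</p>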
<blockquote>
<h3 id="python-virtual-environments">Python Virtual Environments</h3>
<p>Notebooks provide a basic level of code isolation, but all notebooks share the set of packages installed in the Python kernel (as well as any default modifications made to the Python path). This means that any new packages you install (such as those we provide instructions for in some of the examples) will affect all notebooks. This can cause version problems when two different notebooks rely on two different versions of the same package. For this reason, Python professionals try to avoid installing project specific packages in the “global” Python kernel. Instead, the pros create one or more “virtual environments” which isolate the customization needed for specific work. This lets you divide your experiments so that work in one experiment doesn’t accidentally break the work you’ve already done in another experiment.</p>
<p>Virtual environments are an advanced topic which we won’t try to cover here. Interested parties should check out the <a href="https://pypi.python.org/pypi/virtualenv">virtualenv</a> package or read up on using <a href="https://conda.io/docs/">Conda</a> to set up isolated development environments. In our experience, it is easier to use <code>conda</code> to create separate Jupyter environments, but instructions exist for using <code>virtualenv</code> to do this as well. We document our Conda setup at the end of this chapter for the curious.</p>
</blockquote>
<h2 id="example-analysis-and-calculations">Example Analysis and Calculations</h2>
<p>We finish this chapter with code examples illustrating basic analysis techniques we use in the remainder of the book. If you’d like to follow along, you’ll need to install Jupyter as described in the previous section. As always, you can find our code as well as Jupyter notebooks in the <code>code</code> directory in our <a href="https://github.com/OrbitalEnterprises/eve-market-strategies">GitHub project</a>.</p>
<h3 id="example-1---data-extraction-make-a-graph-of-market-history">Example 1 - Data Extraction: Make a Graph of Market History</h3>
<p>In this example, we’re going to create a simple graph of market history (i.e. daily price snapshots) for a year of data. We’ve arbitrarily chosen to graph Tritanium in The Forge. When we’re done, we’ll have a graph like the following:</p>
<figure>
<img src="img/ex1_graph.png" alt="Tritanium price over a year in The Forge" /><figcaption>Tritanium price over a year in The Forge</figcaption>
</figure>
<p>We’ll use this simple example to introduce basic operations we’ll use throughout this book. If you want to follow along, you can download the <a href="code/book/Example_1_Data_Extraction.ipynb">Jupyter notebook</a> for this example. Since this is our first example, we’ll be extra verbose in our explanation.</p>
<p>We’ll create our graph in four steps:</p>
<ol type="1">
<li>We’ll use the Static Data Export (SDE) to look up type and region ID, respectively, for “Tritanium” and “The Forge”.</li>
<li>Next, we’ll construct a date range for a year of data from the current date.</li>
<li>Then, we’ll use the market data service to download daily market snapshots for our date range, and store the data in a Pandas <a href="http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe">DataFrame</a>.</li>
<li>Finally, we’ll plot average price from the DataFrame.</li>
</ol>
<p>We’ll expand on a few variants of these steps at the end of the example to set up for later examples. But before we can do any of this, we need to install <a href="https://github.com/Yelp/bravado">Bravado</a> in order to access Swagger-annotated web services. You can install bravado as follows (do this before you start Jupyter):</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">pip</span> install bravado</code></pre></div>
<blockquote>
<h3 id="installing-bravado-on-windows">Installing Bravado on Windows</h3>
<p>On Windows, you may see an error message about missing C++ tools when building the “twisted” library. You can get these tools for free from Microsoft at <a href="http://landinghub.visualstudio.com/visual-cpp-build-tools">this link</a>. Once these tools are installed, you should be able to complete the install of bravado.</p>
</blockquote>
<p>Once you’ve installed Bravado, you can start Jupyter and create a Python 3 notebook. Almost every notebook we create will start with an import preamble that brings in a few key packages. So we’ll start with this Jupyter cell:</p>
<figure>
<img src="img/ex1_cell1.PNG" alt="Import Preamble" /><figcaption>Import Preamble</figcaption>
</figure>
<p>The first thing we need to do is use the SDE to look up the type and region ID, respectively, for “Tritanium” and “The Forge”. Yes, we already know what these are from memory; we’ll write the code anyway to demonstrate the process. The next two cells import Swagger and create a client which will connect to the online SDE hosted by Orbital Enterprises:</p>
<figure>
<img src="img/ex1_cell2.PNG" alt="Create a Swagger client for the online SDE" /><figcaption>Create a Swagger client for the online SDE</figcaption>
</figure>
<p>The <code>config</code> argument to the Swagger client turns off response validation and instructs the client to return raw Python objects instead of wrapping results in typed Python classes. We turn off response validation because some Swagger endpoints return slightly sloppy but otherwise usable responses. We find working with raw Python objects to be easier than using typed classes, but this is a matter of personal preference. We’re now ready to look up type ID which is done in the following cell:</p>
<figure>
<img src="img/ex1_cell3.PNG" alt="Call and result of looking up Tritanium" /><figcaption>Call and result of looking up “Tritanium”</figcaption>
</figure>
<p>We use the <code>getTypes</code> method on the <code>Inventory</code> endpoint, selecting on the <code>typeName</code> field (using the syntax described <a href="https://evekit.orbital.enterprises//#/sde/main">here</a>). The result is an array of all matches to our query. In this case, there is only one type called “Tritanium” which is the first element of the result array.</p>
<blockquote>
<h3 id="pro-tip-getting-python-function-documentation">Pro Tip: Getting Python Function Documentation</h3>
<p>If you forget the usage of a Python function, you can bring up the Python docstring using the syntax <code>?function</code>. Jupyter will display the docstring in a popup. In the example above, you would use <code>?sde_client.Inventory.getTypes</code> to view the docstring.</p>
</blockquote>
<p>Similarly, we can use the <code>Map</code> endpoint to look up the region ID for “The Forge”:</p>
<figure>
<img src="img/ex1_cell4.PNG" alt="Call and result of looking up The Forge" /><figcaption>Call and result of looking up “The Forge”</figcaption>
</figure>
<p>Of course, we only need the type and region ID, so we’ll tidy things up in the next cell and extract the fields we need into local variables:</p>
<figure>
<img src="img/ex1_cell5.PNG" alt="Extract and save type and region ID" /><figcaption>Extract and save type and region ID</figcaption>
</figure>
<p>Next, we need to create a date range for the days of market data we’d like to display. This is straightforward using <code>datetime</code> and the Pandas function <code>date_range</code>:</p>
<figure>
<img src="img/ex1_cell6.PNG" alt="Create a date range for the days we want to plot" /><figcaption>Create a date range for the days we want to plot</figcaption>
</figure>
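<p>For readers who can’t view the screenshot, here is a rough standard-library sketch of the same idea. The notebook uses the Pandas function <code>date_range</code>; the helper <code>daily_range</code> below is a hypothetical stand-in written without Pandas, and we use a fixed end date so the result is reproducible.</p>

```python
import datetime


def daily_range(end_date, days=365):
    """Return `days` consecutive dates ending the day before end_date."""
    start = end_date - datetime.timedelta(days=days)
    return [start + datetime.timedelta(days=i) for i in range(days)]


# A fixed end date for illustration; the example in the text starts from the current date.
dates = daily_range(datetime.date(2017, 1, 1))
print(dates[0], dates[-1], len(dates))
```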
<p>With these preliminaries out of the way, we’re now ready to start extracting market data. To do this, we’ll need a Swagger client pointing to the Orbital Enterprises market data service. As a test, we can call this service with the first date in our date range:</p>
<figure>
<img src="img/ex1_cell7.PNG" alt="Create a Swagger market data client and extract a day of market history" /><figcaption>Create a Swagger market data client and extract a day of market history</figcaption>
</figure>
<p>We call the <code>history</code> method on the <code>MarketData</code> endpoint, passing a type ID, region ID, and the date we want to extract. This method can only be used to look up data for a single date, so the result is a single JSON object with the requested information. The <code>history</code> endpoint may not have any data for a date we request (e.g. because the date is too far in the past, or the service has not yet loaded very recent data). It is therefore useful to check what happens when we request a missing date:</p>
<figure>
<img src="img/ex1_cell8.PNG" alt="Result of requesting a missing date" /><figcaption>Result of requesting a missing date</figcaption>
</figure>
<p>The result is a nasty stack trace due to an <code>HTTPNotFound</code> exception. We’ll need to handle this exception when we request our date range in case any data is missing.</p>
<blockquote>
<h3 id="using-a-response-object-instead-of-exceptions">Using a Response Object Instead of Exceptions</h3>
<p>The Bravado client provides an alternative way to handle erroneous responses if you’d prefer not to handle exceptions. This is done by requesting a <code>response</code> object as the result of a call (this assumes the client was created with the <code>also_return_response</code> configuration option set to <code>True</code>). To create a <code>response</code> object, change your call syntax from:</p>
<pre><code>result = client.Endpoint.method(...).result()</code></pre>
<p>to:</p>
<pre><code>result, response = client.Endpoint.method(...).result()</code></pre>
<p>The raw response to a call will be captured in the <code>response</code> variable which can be inspected for errors as follows:</p>
<pre><code>if response.status_code != 200:
    # An error occurred
    ...</code></pre>
<p>You can either handle exceptions or use response objects according to your preference. We choose to simply handle exceptions in this example.</p>
</blockquote>
<p>Now that we know how to retrieve market history for a single day, we can retrieve our entire date range with a simple loop:</p>
<figure>
<img src="img/ex1_cell9.PNG" alt="Retrieve history for our date range" /><figcaption>Retrieve history for our date range</figcaption>
</figure>
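<p>The loop in the screenshot can be sketched roughly as follows. This is not the notebook’s exact code: <code>collect_history</code> and <code>fake_fetch</code> are hypothetical names, the <code>avgPrice</code> field is made up, and the fetch function is a stand-in for the real market data call. With Bravado, you would pass <code>HTTPNotFound</code> (from <code>bravado.exception</code>) as the exception to skip.</p>

```python
def collect_history(dates, fetch_day, missing_error=Exception):
    """Fetch one record per date, skipping dates that raise missing_error."""
    history, missing = [], []
    for day in dates:
        try:
            history.append(fetch_day(day))
        except missing_error:
            # No data for this date; remember it and keep going.
            missing.append(day)
    return history, missing


# Hypothetical stand-in for the market data call; it has no data for "2017-01-03".
def fake_fetch(day):
    if day == "2017-01-03":
        raise KeyError(day)
    return {"date": day, "avgPrice": 4.5}


history, missing = collect_history(
    ["2017-01-01", "2017-01-02", "2017-01-03"], fake_fetch, missing_error=KeyError
)
print(len(history), missing)
```

<p>Keeping the list of missing dates makes it easy to see afterwards which days the service could not supply.</p>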
<p>The result is an array of market history, hopefully for every day we requested (the last day in the range will usually be missing because the market data service hasn’t loaded it yet). Now that we have our market data, we need to turn it into a plot. We’ll use a Pandas <a href="http://pandas.pydata.org/pandas-docs/stable/dsintro.html#dataframe"><code>DataFrame</code></a> for this. If you haven’t already, you’ll want to read up on Pandas as we’ll use its data structures and functions throughout the book. There are many ways to create a <code>DataFrame</code> but in our case the most convenient approach will be to base our <code>DataFrame</code> on the array of market data we just loaded. All that is missing is an index for the <code>DataFrame</code>. The natural choice here is to use the <code>date</code> field in each day of market history. However, these dates are not in a format understood by Pandas so we’ll have to convert them. This is easy to do using <code>datetime</code> again. Here’s an example which converts the <code>date</code> field from the first day of market history:</p>
<figure>
<img src="img/ex1_cell10.PNG" alt="Date conversion for the first day of market history" /><figcaption>Date conversion for the first day of market history</figcaption>
</figure>
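<p>As a text alternative to the screenshot, a conversion along these lines should work, assuming the <code>date</code> field holds milliseconds since the epoch in UTC (the format used in the raw files shown later in this example). The function name is our own.</p>

```python
import datetime


def millis_to_datetime(millis):
    """Convert milliseconds since the epoch (UTC) to a timezone-aware datetime."""
    return datetime.datetime.fromtimestamp(millis / 1000, tz=datetime.timezone.utc)


# 1483228800000 is the timestamp that appears in the sample market history files.
print(millis_to_datetime(1483228800000).isoformat())
```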
<p>We’ll turn the converter into a function for convenience, then create our <code>DataFrame</code>:</p>
<figure>
<img src="img/ex1_cell11.PNG" alt="Create a DataFrame from market history" /><figcaption>Create a DataFrame from market history</figcaption>
</figure>
<p>Last but not least, we’re ready to plot our data. Simple plots are very easy to create with a <code>DataFrame</code>. Here, we plot average price for our date range:</p>
<figure>
<img src="img/ex1_cell12.PNG" alt="Plot of Average Price" /><figcaption>Plot of Average Price</figcaption>
</figure>
<p>And that’s it! You’ve just created a simple plot of market history.</p>
<p>We walked through this example in verbose fashion to demonstrate some of the key techniques we’ll need for analysis later in the book. As you develop your own analysis, however, you’ll likely switch to an iterative process which may involve executing parts of a notebook multiple times. You’ll want to avoid re-executing code to download data unless absolutely necessary as more complicated analysis will require order book data which is substantially larger than market history. If you know you’ll be doing this often, you may find it more convenient to download historic data to your local disk and read the data locally instead of calling a web service.</p>
<p>All data available on the market data service site we used in this example is uploaded daily to the Orbital Enterprises Google Storage site (you can find full documentation <a href="https://evekit.orbital.enterprises//#/md/main">here</a>). Historic data is organized by day. You can find data for a given day at the URL: <code>https://storage.googleapis.com/evekit_md/YYYY/MM/DD</code>. At time of writing, six files are stored for each day<a href="#fn8" class="footnoteRef" id="fnref8"><sup>8</sup></a>:</p>
<table>
<colgroup>
<col style="width: 26%" />
<col style="width: 73%" />
</colgroup>
<thead>
<tr class="header">
<th style="text-align: left;">File</th>
<th style="text-align: left;">Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: left;">market_YYYYMMDD.tgz</td>
<td style="text-align: left;">Market history for all regions and types for the given day.</td>
</tr>
<tr class="even">
<td style="text-align: left;">interval_YYYYMMDD_5.tgz</td>
<td style="text-align: left;">Order book snapshots for all regions and types for the given day.</td>
</tr>
<tr class="odd">
<td style="text-align: left;">market_YYYYMMDD.bulk</td>
<td style="text-align: left;">Market history in “bulk” form for all regions and types for the given day.</td>
</tr>
<tr class="even">
<td style="text-align: left;">interval_YYYYMMDD_5.bulk</td>
<td style="text-align: left;">Order book snapshots in “bulk” form for all regions and types for the given day.</td>
</tr>
<tr class="odd">
<td style="text-align: left;">market_YYYYMMDD.index.gz</td>
<td style="text-align: left;">Market history bulk file index for the given day.</td>
</tr>
<tr class="even">
<td style="text-align: left;">interval_YYYYMMDD_5.index.gz</td>
<td style="text-align: left;">Order book snapshot bulk file index for the given day.</td>
</tr>
</tbody>
</table>
<p>We’ll discuss the market history files here, and leave the order book files for the next example.</p>
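<p>The file naming scheme in the table above is regular enough to build URLs programmatically. The following is a small sketch under that assumption; <code>daily_file_url</code> is a hypothetical helper, not part of the evekit libraries.</p>

```python
import datetime

BASE_URL = "https://storage.googleapis.com/evekit_md"


def daily_file_url(day, prefix="market", suffix=".tgz"):
    """Build the storage URL of a daily file, e.g. market_YYYYMMDD.tgz."""
    return "{}/{:%Y/%m/%d}/{}_{:%Y%m%d}{}".format(BASE_URL, day, prefix, day, suffix)


print(daily_file_url(datetime.date(2017, 1, 1)))
print(daily_file_url(datetime.date(2017, 1, 1), prefix="interval", suffix="_5.bulk"))
```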
<p>Historic market data is optimized for two use cases:</p>
<ol type="1">
<li>Download for local storage; and,</li>
<li>Efficient online access using HTTP “range” requests.</li>
</ol>
<p>The tar’d archive files (i.e. the tgz files), when extracted, contain files of the form <code>market_TYPE_YYYYMMDD.history.gz</code> where <code>TYPE</code> is the type ID for which history is recorded in the file, and <code>YYYYMMDD</code> is the date of the market history. The content of each file is a comma-separated table of market history for all regions on the given day. Let’s look at a sample file:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">wget</span> -q https://storage.googleapis.com/evekit_md/2017/01/01/market_20170101.tgz
$ <span class="kw">ls</span> -lh market_20170101.tgz
<span class="kw">-rw-r--r--+</span> 1 mark_000 mark_000 2.2M Jan 31 02:55 market_20170101.tgz
$ <span class="kw">tar</span> xvzf market_20170101.tgz
<span class="kw">...</span> about 10000 files extracted ...
$ <span class="kw">zcat</span> market_34_20170101.history.gz <span class="kw">|</span> <span class="kw">head</span> -n 10
<span class="kw">34</span>,10000025,13,2.89,4.50,4.00,7319512,1483228800000
<span class="kw">34</span>,10000027,1,0.28,0.28,0.28,155501,1483228800000
<span class="kw">34</span>,10000028,44,4.80,4.80,4.80,12336476,1483228800000
<span class="kw">34</span>,10000029,19,2.00,5.00,3.50,41728843,1483228800000
<span class="kw">34</span>,10000030,735,4.60,4.76,4.64,419745507,1483228800000
<span class="kw">34</span>,10000016,964,3.98,4.66,4.36,225219117,1483228800000
<span class="kw">34</span>,10000018,4,2.03,2.03,2.03,3046465,1483228800000
<span class="kw">34</span>,10000020,367,4.50,4.50,4.50,264925396,1483228800000
<span class="kw">34</span>,10000021,1,4.48,4.48,4.48,4500000,1483228800000
<span class="kw">34</span>,10000022,3,1.51,1.51,1.51,10145393,1483228800000</code></pre></div>
<p>The columns in the file are:</p>
<ul>
<li><em>type ID</em> - the type ID for the current row.</li>
<li><em>region ID</em> - the region ID for the current row.</li>
<li><em>order count</em> - the number of market orders for this type in this region on this day.</li>
<li><em>low price</em> - low trade price for this type in this region on this day.</li>
<li><em>high price</em> - high trade price for this type in this region on this day.</li>
<li><em>average price</em> - average trade price for this type in this region on this day.</li>
<li><em>volume</em> - daily volume for this type in this region on this day.</li>
<li><em>date</em> - date of snapshot in milliseconds UTC (since the epoch).</li>
</ul>
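<p>Given the column layout above, a history file can be parsed with the standard <code>csv</code> module. This is a sketch only: the column labels and the <code>parse_history</code> function are our own names, and the sample rows are copied from the listing above.</p>

```python
import csv
import io

# Column labels are our own names for the fields listed above.
COLUMNS = ["type_id", "region_id", "order_count", "low", "high", "average", "volume", "date"]
CONVERTERS = [int, int, int, float, float, float, int, int]


def parse_history(text):
    """Parse comma-separated market history text into a list of dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # skip blank lines
        values = [convert(value) for convert, value in zip(CONVERTERS, fields)]
        rows.append(dict(zip(COLUMNS, values)))
    return rows


sample = (
    "34,10000025,13,2.89,4.50,4.00,7319512,1483228800000\n"
    "34,10000027,1,0.28,0.28,0.28,155501,1483228800000\n"
)
rows = parse_history(sample)
print(rows[0]["region_id"], rows[0]["average"])
```

<p>To read an extracted file directly, you could pass the text obtained from <code>gzip.open(path, "rt").read()</code> to the same function.</p>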
<p>The data stored in the bulk files has the same format but is organized differently in order to support efficient online requests using an HTTP range header. We construct the bulk file by concatenating each of the individual compressed market files. This results in a file with roughly the same size as the archive, but which needs an index in order to recover market history for a particular type. This is the purpose of the market index file, which records the byte range for each market type stored in the bulk file. Here are the first ten lines for the index file for our sample date:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">curl</span> -s https://storage.googleapis.com/evekit_md/2017/01/01/market_20170101.index.gz <span class="kw">|</span> <span class="kw">zcat</span> <span class="kw">|</span> <span class="kw">head</span> -n 10
<span class="kw">market_18_20170101.history.gz</span> 0
<span class="kw">market_19_20170101.history.gz</span> 984
<span class="kw">market_20_20170101.history.gz</span> 1928
<span class="kw">market_21_20170101.history.gz</span> 3678
<span class="kw">market_22_20170101.history.gz</span> 4439
<span class="kw">market_34_20170101.history.gz</span> 5431
<span class="kw">market_35_20170101.history.gz</span> 8953
<span class="kw">market_36_20170101.history.gz</span> 12396
<span class="kw">market_37_20170101.history.gz</span> 15820
<span class="kw">market_38_20170101.history.gz</span> 19108</code></pre></div>
<p>Thus, to recover type 34 we need to extract bytes 5431 through 8952 (inclusive) from the bulk file. We can do this by using an HTTP “range” request as follows:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">curl</span> -s -H <span class="st">"range: bytes=5431-8952"</span> https://storage.googleapis.com/evekit_md/2017/01/01/market_20170101.bulk <span class="kw">|</span> <span class="kw">zcat</span> <span class="kw">|</span> <span class="kw">head</span> -n 10
<span class="kw">34</span>,10000025,13,2.89,4.50,4.00,7319512,1483228800000
<span class="kw">34</span>,10000027,1,0.28,0.28,0.28,155501,1483228800000
<span class="kw">34</span>,10000028,44,4.80,4.80,4.80,12336476,1483228800000
<span class="kw">34</span>,10000029,19,2.00,5.00,3.50,41728843,1483228800000
<span class="kw">34</span>,10000030,735,4.60,4.76,4.64,419745507,1483228800000
<span class="kw">34</span>,10000016,964,3.98,4.66,4.36,225219117,1483228800000
<span class="kw">34</span>,10000018,4,2.03,2.03,2.03,3046465,1483228800000
<span class="kw">34</span>,10000020,367,4.50,4.50,4.50,264925396,1483228800000
<span class="kw">34</span>,10000021,1,4.48,4.48,4.48,4500000,1483228800000
<span class="kw">34</span>,10000022,3,1.51,1.51,1.51,10145393,1483228800000</code></pre></div>
<p>Note that this is the same data we extracted from the downloaded archive.</p>
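<p>The byte range arithmetic used above is easy to automate. The sketch below assumes each index line is a file name followed by whitespace and a starting offset, as in the listing above; <code>byte_ranges</code> and <code>range_header</code> are hypothetical helper names.</p>

```python
def byte_ranges(index_lines):
    """Map each file name in a bulk index to its (start, end) byte range.

    end is inclusive; the last entry gets end=None, meaning "to end of file".
    """
    entries = []
    for line in index_lines:
        if not line.strip():
            continue
        name, offset = line.split()
        entries.append((name, int(offset)))
    ranges = {}
    for i, (name, start) in enumerate(entries):
        # An entry ends one byte before the next entry starts.
        end = entries[i + 1][1] - 1 if i + 1 < len(entries) else None
        ranges[name] = (start, end)
    return ranges


def range_header(start, end):
    """Format the value of an HTTP range header for the given byte range."""
    return "bytes={}-{}".format(start, "" if end is None else end)


index = [
    "market_22_20170101.history.gz 4439",
    "market_34_20170101.history.gz 5431",
    "market_35_20170101.history.gz 8953",
]
ranges = byte_ranges(index)
print(range_header(*ranges["market_34_20170101.history.gz"]))
```

<p>The resulting string can be sent as the <code>range</code> header with any HTTP client, just as in the <code>curl</code> command above.</p>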
<p>As an illustration of code which makes use of downloaded data (if available), we’ll conclude this example with an introduction to library code we’ll be using in later examples. You can find our library code in the <a href="https://github.com/OrbitalEnterprises/eve-market-strategies/tree/master/code">code</a> folder on our GitHub site. You can incorporate our libraries into your notebooks by copying the <a href="https://github.com/OrbitalEnterprises/eve-market-strategies/tree/master/code/evekit">evekit</a> folder (and all its sub-folders) to your <code>.ipython</code> directory (or another convenient directory in your Python path).</p>
<p>We can re-implement this first example using the following modules from our libraries:</p>
<ol type="1">
<li><code>evekit.online.Download</code> - download archive files to local storage.</li>
<li><code>evekit.reference.Client</code> - make it easy to instantiate Swagger clients for commonly used services.</li>
<li><code>evekit.marketdata.MarketHistory</code> - make it easy to retrieve market history in various forms.</li>
<li><code>evekit.util</code> - a collection of useful utility functions.</li>
</ol>
<p>You can view <a href="code/book/Example_1_Data_Extraction_With_Libraries.ipynb">this Jupyter notebook</a> to see this example implemented with these libraries. We didn’t actually download any archives in the original example, but we include a download in the re-implemented example to demonstrate how these libraries function.</p>
<h3 id="example-2---book-data-compute-average-daily-spread">Example 2 - Book Data: Compute Average Daily Spread</h3>
<p>In this example, we turn our attention to analyzing order book data. The goal of this example is to compute the average daily <em>spread</em> for Tritanium in The Forge region for a given date. Spread is defined as the difference in price between the lowest priced sell order, and the highest priced buy order. Among other things, spread is an indication of whether market making will be profitable for a given asset, but we’ll get to that in a later chapter. The average daily spread is the average of the spreads for each order book snapshot. At time of writing, order book snapshots are generated every five minutes. So the average daily spread is just the average of the spread computed for each of the 288 book snapshots which make up a day.</p>
<p>We’ll start by getting familiar with the order book data endpoint on the <a href="https://evekit.orbital.enterprises//#/md/ui">Orbital Enterprises market data service</a> site:</p>
<figure>
<img src="img/ex2_order_book_ep.PNG" alt="Order book endpoint" /><figcaption>Order book endpoint</figcaption>
</figure>
<p>There are actually two endpoints, but we’re only looking at the historic endpoint for now. We’ll cover the “latest book” endpoint in a later chapter. As with the market history endpoint, the order book endpoint expects a type ID, a region ID, and a date. However, the date field may optionally include a time. The order book snapshot returned by the endpoint will be the latest snapshot <em>before</em> the specified time. Here’s an example of the result returned with type ID 34 (Tritanium), region ID 10000002 (The Forge), and timestamp <code>2017-01-01 12:02:00 UTC</code> (note that this endpoint can parse time zone specifications properly):</p>
<div class="sourceCode"><pre class="sourceCode json"><code class="sourceCode json"><span class="fu">{</span>
<span class="dt">"bookTime"</span><span class="fu">:</span> <span class="dv">1483272000000</span><span class="fu">,</span>
<span class="dt">"orders"</span><span class="fu">:</span> <span class="ot">[</span>
<span class="fu">{</span>
<span class="dt">"typeID"</span><span class="fu">:</span> <span class="dv">34</span><span class="fu">,</span>
<span class="dt">"regionID"</span><span class="fu">:</span> <span class="dv">10000002</span><span class="fu">,</span>
<span class="dt">"orderID"</span><span class="fu">:</span> <span class="dv">4708935394</span><span class="fu">,</span>
<span class="dt">"buy"</span><span class="fu">:</span> <span class="kw">true</span><span class="fu">,</span>
<span class="dt">"issued"</span><span class="fu">:</span> <span class="dv">1481705362000</span><span class="fu">,</span>
<span class="dt">"price"</span><span class="fu">:</span> <span class="dv">5</span><span class="fu">,</span>
<span class="dt">"volumeEntered"</span><span class="fu">:</span> <span class="dv">200000000</span><span class="fu">,</span>
<span class="dt">"minVolume"</span><span class="fu">:</span> <span class="dv">1</span><span class="fu">,</span>
<span class="dt">"volume"</span><span class="fu">:</span> <span class="dv">43345724</span><span class="fu">,</span>
<span class="dt">"orderRange"</span><span class="fu">:</span> <span class="st">"solarsystem"</span><span class="fu">,</span>
<span class="dt">"locationID"</span><span class="fu">:</span> <span class="dv">60002242</span><span class="fu">,</span>
<span class="dt">"duration"</span><span class="fu">:</span> <span class="dv">90</span>
<span class="fu">}</span><span class="ot">,</span>
<span class="fu">{</span>
<span class="dt">"typeID"</span><span class="fu">:</span> <span class="dv">34</span><span class="fu">,</span>
<span class="dt">"regionID"</span><span class="fu">:</span> <span class="dv">10000002</span><span class="fu">,</span>
<span class="dt">"orderID"</span><span class="fu">:</span> <span class="dv">4734310642</span><span class="fu">,</span>
<span class="dt">"buy"</span><span class="fu">:</span> <span class="kw">true</span><span class="fu">,</span>
<span class="dt">"issued"</span><span class="fu">:</span> <span class="dv">1483260173000</span><span class="fu">,</span>
<span class="dt">"price"</span><span class="fu">:</span> <span class="fl">4.9</span><span class="fu">,</span>
<span class="dt">"volumeEntered"</span><span class="fu">:</span> <span class="dv">100000000</span><span class="fu">,</span>
<span class="dt">"minVolume"</span><span class="fu">:</span> <span class="dv">1</span><span class="fu">,</span>
<span class="dt">"volume"</span><span class="fu">:</span> <span class="dv">99928181</span><span class="fu">,</span>
<span class="dt">"orderRange"</span><span class="fu">:</span> <span class="st">"station"</span><span class="fu">,</span>
<span class="dt">"locationID"</span><span class="fu">:</span> <span class="dv">60015026</span><span class="fu">,</span>
<span class="dt">"duration"</span><span class="fu">:</span> <span class="dv">90</span>
<span class="fu">}</span><span class="ot">,</span>
<span class="er">...</span> <span class="er">many</span> <span class="er">more</span> <span class="er">buy</span> <span class="er">orders</span> <span class="er">...</span>
<span class="fu">{</span>
<span class="dt">"typeID"</span><span class="fu">:</span> <span class="dv">34</span><span class="fu">,</span>
<span class="dt">"regionID"</span><span class="fu">:</span> <span class="dv">10000002</span><span class="fu">,</span>
<span class="dt">"orderID"</span><span class="fu">:</span> <span class="dv">4733287152</span><span class="fu">,</span>
<span class="dt">"buy"</span><span class="fu">:</span> <span class="kw">false</span><span class="fu">,</span>
<span class="dt">"issued"</span><span class="fu">:</span> <span class="dv">1483171052000</span><span class="fu">,</span>
<span class="dt">"price"</span><span class="fu">:</span> <span class="fl">4.69</span><span class="fu">,</span>
<span class="dt">"volumeEntered"</span><span class="fu">:</span> <span class="dv">5612007</span><span class="fu">,</span>
<span class="dt">"minVolume"</span><span class="fu">:</span> <span class="dv">1</span><span class="fu">,</span>
<span class="dt">"volume"</span><span class="fu">:</span> <span class="dv">747984</span><span class="fu">,</span>
<span class="dt">"orderRange"</span><span class="fu">:</span> <span class="st">"region"</span><span class="fu">,</span>
<span class="dt">"locationID"</span><span class="fu">:</span> <span class="dv">60007498</span><span class="fu">,</span>
<span class="dt">"duration"</span><span class="fu">:</span> <span class="dv">90</span>
<span class="fu">}</span><span class="ot">,</span>
<span class="fu">{</span>
<span class="dt">"typeID"</span><span class="fu">:</span> <span class="dv">34</span><span class="fu">,</span>
<span class="dt">"regionID"</span><span class="fu">:</span> <span class="dv">10000002</span><span class="fu">,</span>
<span class="dt">"orderID"</span><span class="fu">:</span> <span class="dv">4734141760</span><span class="fu">,</span>
<span class="dt">"buy"</span><span class="fu">:</span> <span class="kw">false</span><span class="fu">,</span>
<span class="dt">"issued"</span><span class="fu">:</span> <span class="dv">1483239477000</span><span class="fu">,</span>
<span class="dt">"price"</span><span class="fu">:</span> <span class="fl">4.77</span><span class="fu">,</span>
<span class="dt">"volumeEntered"</span><span class="fu">:</span> <span class="dv">46906</span><span class="fu">,</span>
<span class="dt">"minVolume"</span><span class="fu">:</span> <span class="dv">1</span><span class="fu">,</span>
<span class="dt">"volume"</span><span class="fu">:</span> <span class="dv">46906</span><span class="fu">,</span>
<span class="dt">"orderRange"</span><span class="fu">:</span> <span class="st">"region"</span><span class="fu">,</span>
<span class="dt">"locationID"</span><span class="fu">:</span> <span class="dv">60003079</span><span class="fu">,</span>
<span class="dt">"duration"</span><span class="fu">:</span> <span class="dv">90</span>
<span class="fu">}</span><span class="ot">,</span>
<span class="er">...</span> <span class="er">many</span> <span class="er">more</span> <span class="er">sell</span> <span class="er">orders</span> <span class="er">...</span>
<span class="ot">]</span><span class="fu">,</span>
<span class="dt">"typeID"</span><span class="fu">:</span> <span class="dv">34</span><span class="fu">,</span>
<span class="dt">"regionID"</span><span class="fu">:</span> <span class="dv">10000002</span>
<span class="fu">}</span></code></pre></div>
<p>The <code>bookTime</code> field reports the actual timestamp of this snapshot in milliseconds UTC since the epoch. In this example, the book time is <code>2017-01-01 12:00 UTC</code> because that is the latest book snapshot at requested time <code>2017-01-01 12:02 UTC</code>.</p>
<blockquote>
<h3 id="pro-tip-converting-timestamps">Pro Tip: Converting Timestamps</h3>
<p>If you plan to work with Orbital Enterprises raw data on a frequent basis, you’ll want to find a convenient tool for converting millisecond timestamps to human readable form. The author uses the <a href="https://chrome.google.com/webstore/detail/utime/kpcibgnngaaabebmcabmkocdokepdaki?utm_source=chrome-app-launcher-info-dialog">Utime Chrome plugin</a> for quick conversions. You’ll only need this when browsing the data manually. The evekit libraries (should you choose to use them) handle these conversions for you.</p>
</blockquote>
<p>Orders in the order book are contained in the <code>orders</code> array with buy orders appearing first, followed by sell orders. To make processing easier, buy orders are sorted with the highest priced orders first, and sell orders are sorted with the lowest priced orders first. Order sorting simplifies spread computations, but there’s a catch: a spread is only valid if the highest buy and lowest sell are eligible for matching (except for price, of course). That is, the spread is not always the difference between the highest priced buy and the lowest priced sell, because those orders may not be matchable. We see this behavior in the sample output above: the highest priced buy order is for 5 ISK, but the lowest priced sell order is 4.69 ISK. Even though the resulting spread would be negative, which can never happen according to market order matching rules, the orders are valid because they cannot match: the buy order is ranged to the solar system Otanuomi but the sell order was placed in the Obe solar system. For the sake of simplicity, we’ll limit this example to computing the spread for buy and sell orders at a given station. We’ll use “Jita IV - Moon 4 - Caldari Navy Assembly Plant”, which is the most popular station in The Forge region and has location ID 60003760. In reality, there may be many spreads for a given type in a given region, as different parts of the region may have unique sets of matching orders. Computing proper spreads in this way would also require implementing a proper order matching algorithm, which we’ll leave to a later example. For strategies like market making, however, one is normally only concerned with “station spread”, which is what we happen to be computing in this example.</p>
<p>We assume you’ve already installed <code>bravado</code> as described in <a href="#example-1---data-extraction-make-a-graph-of-market-history">Example 1</a>. If you haven’t installed <code>bravado</code>, please do so now. As always, you can follow along with this example by downloading the <a href="code/book/Example_2_Compute_Average_Daily_Spread.ipynb">Jupyter notebook</a>.</p>
<p>The first two cells of this example import standard libraries and configure properties such as <code>type_id</code>, <code>region_id</code>, <code>station_id</code> and <code>compute_date</code>, which is set to the timestamp of the first order book snapshot we wish to measure. Note that we use the EveKit library to retrieve an instance of the SDE client:</p>
<figure>
<img src="img/ex2_cell1.PNG" alt="Example Setup" /><figcaption>Example Setup</figcaption>
</figure>
<p>We can use the Orbital Enterprises market data client to extract the first book snapshot:</p>
<figure>
<img src="img/ex2_cell2.PNG" alt="Order Book Snapshot for Tritanium in The Forge at 2017-01-01 00:00 UTC" /><figcaption>Order Book Snapshot for Tritanium in The Forge at 2017-01-01 00:00 UTC</figcaption>
</figure>
<p>Buy and sell orders are conveniently sorted in the result. We use a filter to extract these orders by type (i.e. buy or sell) and station ID, then implement a simple function to calculate the spread for a set of buys and sells:</p>
<figure>
<img src="img/ex2_cell3.PNG" alt="Sort and Compute Spread" /><figcaption>Sort and Compute Spread</figcaption>
</figure>
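<p>The filter and spread steps can be sketched in plain Python as follows. This is only an illustration, not the notebook code: the dictionary keys <code>buy</code>, <code>price</code> and <code>location_id</code>, the helper names, and the toy orders are all assumptions made for the example. The sketch uses <code>max</code> and <code>min</code> rather than relying on the sort order of the snapshot.</p>

```python
def station_orders(orders, station_id, buy):
    """Return the orders of one side (buy or sell) placed at a single station."""
    return [o for o in orders if o["location_id"] == station_id and o["buy"] == buy]


def station_spread(orders, station_id):
    """Lowest sell price minus highest buy price at a station.

    Returns None when either side of the book is empty at that station,
    because the spread is not defined in that case.
    """
    buys = station_orders(orders, station_id, True)
    sells = station_orders(orders, station_id, False)
    if not buys or not sells:
        return None
    best_bid = max(o["price"] for o in buys)
    best_ask = min(o["price"] for o in sells)
    return best_ask - best_bid


# Toy snapshot: prices and the second location ID are made up for illustration.
JITA = 60003760
snapshot = [
    {"buy": True, "price": 4.50, "location_id": JITA},
    {"buy": True, "price": 5.00, "location_id": 60000001},
    {"buy": False, "price": 4.75, "location_id": JITA},
    {"buy": False, "price": 4.69, "location_id": 60000001},
]
spread = station_spread(snapshot, JITA)
```

<p>Restricting both sides to one station sidesteps the negative "spread" that appears when the best buy and best sell sit in different places and cannot match.</p>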
<p>Finally, we’re ready to compute spread for all 5-minute snapshots on the target date. We can do this with a simple loop, requesting the next snapshot at each iteration and adding the spread to an array of values which are averaged at the end:</p>
<figure>
<img src="img/ex2_cell4.PNG" alt="Compute Spreads for All Snapshots and Average" /><figcaption>Compute Spreads for All Snapshots and Average</figcaption>
</figure>
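<p>A rough sketch of the averaging loop is shown below. The function <code>fetch_spread</code> is a hypothetical stand-in for "request the snapshot at this time and compute its spread"; it is not part of any library used in this chapter, and here it is replaced by a constant so the sketch runs without network access.</p>

```python
from datetime import datetime, timedelta, timezone


def snapshot_times(day_start, interval_minutes=5):
    """Yield every snapshot time in the 24 hours starting at day_start."""
    step = timedelta(minutes=interval_minutes)
    end = day_start + timedelta(days=1)
    current = day_start
    while current < end:
        yield current
        current += step


def average_spread(day_start, fetch_spread):
    """Average fetch_spread(t) over the day, skipping snapshots with no spread."""
    spreads = []
    for t in snapshot_times(day_start):
        value = fetch_spread(t)
        if value is not None:
            spreads.append(value)
    return sum(spreads) / len(spreads) if spreads else None


compute_date = datetime(2017, 1, 1, tzinfo=timezone.utc)
times = list(snapshot_times(compute_date))
# Stand-in for the real per-snapshot computation: a constant 0.5 ISK spread.
average = average_spread(compute_date, lambda t: 0.5)
```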
<p>And with that, you’ve just computed average daily spread.</p>
<p>As in the first example, we now turn to order book data formats for local storage. You can find order book files for a given day at the URL: <code>https://storage.googleapis.com/evekit_md/YYYY/MM/DD</code>. Three files are relevant for order book data:</p>
<table>
<colgroup>
<col style="width: 26%" />
<col style="width: 73%" />
</colgroup>
<thead>
<tr class="header">
<th style="text-align: left;">File</th>
<th style="text-align: left;">Description</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td style="text-align: left;">interval_YYYYMMDD_5.tgz</td>
<td style="text-align: left;">Order book snapshots for all regions and types for the given day.</td>
</tr>
<tr class="even">
<td style="text-align: left;">interval_YYYYMMDD_5.bulk</td>
<td style="text-align: left;">Order book snapshots in “bulk” form for all regions and types for the given day.</td>
</tr>
<tr class="odd">
<td style="text-align: left;">interval_YYYYMMDD_5.index.gz</td>
<td style="text-align: left;">Index into the order book snapshot bulk file for the given day.</td>
</tr>
</tbody>
</table>
<p>Note that book data files are significantly larger than market history files as they contain every order book snapshot for every type in every region on a given day. At time of writing, a typical book index file is about 100KB, which is manageable. However, bulk files are typically 500MB while zipped archives are 250MB. A year of data requires about 90GB of storage. By the way, the <code>5</code> in the file name indicates that these are five-minute snapshot files. In the future, we may generate snapshots with different intervals. You can easily generate your own lower-frequency samples using the five-minute samples as a source, since these are currently the highest resolution samples available.</p>
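<p>As a small illustration of deriving a coarser sampling frequency, the sketch below keeps every n-th five-minute snapshot. The function name <code>resample</code> and the use of a plain list of snapshots are assumptions made for this example.</p>

```python
def resample(snapshots, every_minutes, base_minutes=5):
    """Keep one snapshot per every_minutes from a list sampled every base_minutes."""
    if every_minutes % base_minutes != 0:
        raise ValueError("target interval must be a multiple of the base interval")
    return snapshots[::every_minutes // base_minutes]


# Stand-in for a day of snapshots: minutes since midnight for each sample.
five_minute_times = [i * 5 for i in range(288)]
half_hourly = resample(five_minute_times, 30)
```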
<p>The tar’d archive files (i.e. tgz files), when extracted, contain files of the form <code>interval_TYPE_YYYYMMDD_5.book.gz</code> where <code>TYPE</code> is the type ID for which order book snapshots are recorded, and <code>YYYYMMDD</code> is the date on which the snapshots were recorded. The content of each file is slightly more complicated and is explained below. Here are the first few lines of a sample file:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">wget</span> -q https://storage.googleapis.com/evekit_md/2017/01/01/interval_20170101_5.tgz
$ <span class="kw">ls</span> -lh interval_20170101_5.tgz
<span class="kw">-rw-r--r--+</span> 1 mark_000 mark_000 223M Jan 2 03:43 interval_20170101_5.tgz
$ <span class="kw">tar</span> xvzf interval_20170101_5.tgz
<span class="kw">...</span> about 10000 files extracted ...
$ <span class="kw">zcat</span> interval_34_20170101_5.book.gz <span class="kw">|</span> <span class="kw">head</span> -n 10
<span class="kw">34</span>
<span class="kw">288</span>
<span class="kw">10000025</span>
<span class="kw">1483228800000</span>
<span class="kw">10</span>
<span class="kw">12</span>
<span class="kw">4730662577</span>,true,1482974353000,4.85,100000000,1,46444066,station,61000807,30
<span class="kw">4732527006</span>,true,1483117790000,4.50,100000000,1,99774417,station,61000912,90
<span class="kw">4733368217</span>,true,1483178139000,4.45,340000,1,340000,solarsystem,1021334931934,30
<span class="kw">4724371732</span>,true,1482505562000,4.05,10000000,1,4636157,2,61000912,90</code></pre></div>
<p>The first two lines indicate the type contained in the file, in this case Tritanium (type ID 34), and the number of snapshots collected for each region, in this case 288 (a snapshot every five minutes for 24 hours). The remainder of the file organizes snapshots per region and is organized as follows:</p>
<pre><code>FIRST_REGION_ID
FIRST_REGION_FIRST_SNAPSHOT_TIME
FIRST_REGION_FIRST_SNAPSHOT_BUY_ORDER_COUNT
FIRST_REGION_FIRST_SNAPSHOT_SELL_ORDER_COUNT
FIRST_REGION_FIRST_SNAPSHOT_BUY_ORDER
...
FIRST_REGION_FIRST_SNAPSHOT_SELL_ORDER
...
FIRST_REGION_SECOND_SNAPSHOT_TIME
...
SECOND_REGION_ID
...</code></pre>
<p>The columns for each order row are:</p>
<ul>
<li><em>order ID</em> - Unique market order ID.</li>
<li><em>buy</em> - “true” if this order represents a buy, “false” otherwise.</li>
<li><em>issued</em> - Order issue date in milliseconds UTC (since the epoch).</li>
<li><em>price</em> - Order price.</li>
<li><em>volume entered</em> - Volume entered when order was created.</li>
<li><em>min volume</em> - Minimum volume required for each order fill.</li>
<li><em>volume</em> - Current remaining volume to be filled in the order.</li>
<li><em>order range</em> - Order range string. One of “station”, “solarsystem”, “region” or a number representing the number of jumps allowed from the station where the order was entered.</li>
<li><em>location ID</em> - Location ID of station where order was entered.</li>
<li><em>duration</em> - Order duration in days.</li>
</ul>
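<p>The layout above can be read with a short parser. The sketch below assumes, as the header suggests, that every region block contains exactly the number of snapshots given on the second line. The function and key names are our own, and the sample text is a toy file: its header is shortened to one snapshot and its sell order is made up.</p>

```python
def parse_order(line):
    """Convert one comma-separated order row into a dictionary."""
    parts = line.split(",")
    return {
        "order_id": int(parts[0]),
        "buy": parts[1] == "true",
        "issued": int(parts[2]),          # milliseconds since the epoch, UTC
        "price": float(parts[3]),
        "volume_entered": int(parts[4]),
        "min_volume": int(parts[5]),
        "volume": int(parts[6]),
        "order_range": parts[7],          # "station", "solarsystem", "region" or a jump count
        "location_id": int(parts[8]),
        "duration": int(parts[9]),
    }


def parse_book(lines):
    """Parse the lines of an uncompressed per-type book file.

    Returns (type_id, regions) where regions maps a region ID to a list of
    snapshots, each a dict with keys "time", "buys" and "sells".
    """
    it = iter([line.strip() for line in lines if line.strip()])
    type_id = int(next(it))
    snapshots_per_region = int(next(it))
    regions = {}
    for region_line in it:  # every remaining block starts with a region ID
        snapshots = []
        for _ in range(snapshots_per_region):
            time_ms = int(next(it))
            buy_count = int(next(it))
            sell_count = int(next(it))
            buys = [parse_order(next(it)) for _ in range(buy_count)]
            sells = [parse_order(next(it)) for _ in range(sell_count)]
            snapshots.append({"time": time_ms, "buys": buys, "sells": sells})
        regions[int(region_line)] = snapshots
    return type_id, regions


# Toy file: one region, one snapshot, one buy (from the sample) and one made-up sell.
sample = """34
1
10000025
1483228800000
1
1
4730662577,true,1482974353000,4.85,100000000,1,46444066,station,61000807,30
1,false,1483000000000,5.10,1000,1,1000,region,60003760,90
"""
type_id, regions = parse_book(sample.splitlines())
```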
<p>As with market history, the bulk files are simply the concatenation of the per-type book files together with an index to allow efficient range requests. We can retrieve the same data as above by first consulting the index file:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">curl</span> -s https://storage.googleapis.com/evekit_md/2017/01/01/interval_20170101_5.index.gz <span class="kw">|</span> <span class="kw">zcat</span> <span class="kw">|</span> <span class="kw">head</span> -n 10
<span class="kw">interval_18_20170101_5.book.gz</span> 0
<span class="kw">interval_19_20170101_5.book.gz</span> 143131
<span class="kw">interval_20_20170101_5.book.gz</span> 234988
<span class="kw">interval_21_20170101_5.book.gz</span> 447702
<span class="kw">interval_22_20170101_5.book.gz</span> 522083
<span class="kw">interval_34_20170101_5.book.gz</span> 619717
<span class="kw">interval_35_20170101_5.book.gz</span> 1236236
<span class="kw">interval_36_20170101_5.book.gz</span> 1780447
<span class="kw">interval_37_20170101_5.book.gz</span> 2208243
<span class="kw">interval_38_20170101_5.book.gz</span> 2651627</code></pre></div>
<p>We then send a range request. The next file in the index starts at offset 1236236 and HTTP byte ranges are inclusive, so we request bytes 619717 through 1236235:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">curl</span> -s -H <span class="st">"range: bytes=619717-1236235"</span> https://storage.googleapis.com/evekit_md/2017/01/01/interval_20170101_5.bulk <span class="kw">|</span> <span class="kw">zcat</span> <span class="kw">|</span> <span class="kw">head</span> -n 10
<span class="kw">34</span>
<span class="kw">288</span>
<span class="kw">10000025</span>
<span class="kw">1483228800000</span>
<span class="kw">10</span>
<span class="kw">12</span>
<span class="kw">4730662577</span>,true,1482974353000,4.85,100000000,1,46444066,station,61000807,30
<span class="kw">4732527006</span>,true,1483117790000,4.50,100000000,1,99774417,station,61000912,90
<span class="kw">4733368217</span>,true,1483178139000,4.45,340000,1,340000,solarsystem,1021334931934,30
<span class="kw">4724371732</span>,true,1482505562000,4.05,10000000,1,4636157,2,61000912,90</code></pre></div>
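<p>The byte range for a given type can be derived from the index mechanically. The sketch below is one way to do it, using three index lines copied from the listing above. The helper names are ours, and the handling of the last index entry (an open-ended range) is an assumption about how the bulk file ends.</p>

```python
def parse_index(lines):
    """Return a list of (file_name, offset) pairs in file order."""
    entries = []
    for line in lines:
        if line.strip():
            name, offset = line.split()
            entries.append((name, int(offset)))
    return entries


def range_header(entries, name):
    """Build an HTTP Range header value covering exactly the named file."""
    for i, (entry_name, start) in enumerate(entries):
        if entry_name == name:
            if i + 1 < len(entries):
                # HTTP byte ranges are inclusive, so stop one byte short of
                # the offset where the next file begins.
                return "bytes=%d-%d" % (start, entries[i + 1][1] - 1)
            # Assumed: the last file runs to the end of the bulk file.
            return "bytes=%d-" % start
    raise KeyError(name)


index_text = """interval_22_20170101_5.book.gz 522083
interval_34_20170101_5.book.gz 619717
interval_35_20170101_5.book.gz 1236236
"""
entries = parse_index(index_text.splitlines())
tritanium_range = range_header(entries, "interval_34_20170101_5.book.gz")
```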
<p>The format of book files is currently optimized for selection by type, which may not be appropriate for all use cases. It is usually best to download the book files you need, and re-organize them according to your use case. The EveKit libraries provide support for basic downloading, including only downloading the types or regions you want.</p>
<p>The second part of the Jupyter Notebook for this example illustrates how to download and compute average spread using the EveKit libraries and Pandas. This can be done in four steps:</p>
<ol type="1">
<li>We first download the order book for Tritanium in the Forge on the target date. By filtering the download by type and region, we can avoid downloading the entire 230MB archive file; the file stored on disk is just 104K.</li>
<li>We next use the OrderBook class to load book data as a Pandas DataFrame. The DataFrame stores each order as a row where the index is the time of the book snapshot where the order was listed. We also add columns for type and region ID to allow for further filtering. Since the index is just snapshot time, we can recover the individual snapshots by grouping on index. Each Pandas group then becomes a book snapshot.</li>
<li>We re-implement our spread computation function to operate on a DataFrame representing a snapshot instead of an array of buys and sells. The computation is the same, except that we return “NaN” in cases where there is no well-defined spread. This is done because the Pandas mean function, which we use in the next step, conveniently ignores NaN values.</li>
<li>We finish by combining Pandas “groupby” and “apply” with our spread computation function to compute a series of spreads, which can then be averaged with Pandas “mean”.</li>
</ol>
<h3 id="example-3---trading-rules-build-a-buy-matching-algorithm">Example 3 - Trading Rules: Build a Buy Matching Algorithm</h3>
<p>As described in the introductory material in this chapter, sell limit orders do not specify a range. Buyers explicitly choose which sell orders they wish to buy from and, if the buyer’s price is at least as large as the seller’s price, then the order will match at the location of the seller (but at the maximum of the buyer’s price and the seller’s price; also, the lowest priced asset at the target station always matches first). When selling at the market, however, the matching rules are more complicated because buy limit orders specify a range. In order to figure out whether two orders match, the location of the buyer and seller must be compared against the range specified in the buyer’s order.</p>
<p>The analysis of more sophisticated trading strategies will eventually require that you determine which orders you can sell to in a given market. Thus, in this example, we show how to implement a “buy order matching” algorithm. Buy matching boils down to determining the distance between the seller and potentially matching buy orders. We show how to use map data from the Static Data Export (SDE) to compute distances between buyers and sellers (or rather, the distance between the solar systems where their stations reside). One added complication is that player-owned structures are not included in the SDE. Instead, a separate data service must be consulted to map a player-owned structure to the solar system where it is located. We show how to use one such service in this example. Finally, we demonstrate the use of our matching algorithm against an order book snapshot. As always, you can follow along with this example by downloading the <a href="code/book/Example_3_Buy_Matching_Algorithm.ipynb">Jupyter Notebook</a>.</p>
<blockquote>
<h3 id="note">NOTE</h3>
<p>This example requires the <code>scipy</code> package. If you’ve installed Anaconda, then you should already have <code>scipy</code>. If not, then you’ll need to install it using your favorite Python package manager.</p>
</blockquote>
<p>Let’s start by looking at a function which determines whether a sell order placed at a particular station can match a given buy order visible at the same station. We need the following information to make this determination:</p>
<ul>
<li><em>region_id</em> - the ID of the region we’re trying to sell in.</li>
<li><em>sell_station_id</em> - the ID of the station where the sell order is placed.</li>
<li><em>buy_station_id</em> - the ID of the station where the buy order is placed.</li>
<li><em>order_range</em> - the order range of the buy order. This will be one of “region”, “solarsystem”, “station”, or the maximum number of jumps allowed between the buyer and seller solar systems.</li>
</ul>
<p>Strictly speaking, the region ID is not required as it can be inferred from station ID. We include the region ID here as a reminder that trades can only occur within a single region: EVE does not currently allow cross-region market trading. Henceforth, unless otherwise stated, we assume the selling and buying stations are within the same region.</p>
<p>With the information above, we can write the following order matching function:</p>
<div class="sourceCode"><pre class="sourceCode python"><code class="sourceCode python"><span class="kw">def</span> order_match(sell_station_id, buy_station_id, order_range):
    <span class="co">"""</span>
<span class="co">    Returns true if a sell market order placed at sell_station_id could be matched</span>
<span class="co">    by a buy order at buy_station_id with the given order_range</span>
<span class="co">    """</span>
    <span class="co"># Case 1 - "region"</span>
    <span class="cf">if</span> order_range <span class="op">==</span> <span class="st">'region'</span>:
        <span class="cf">return</span> <span class="va">True</span>
    <span class="co"># Case 2 - "station"</span>
    <span class="cf">if</span> order_range <span class="op">==</span> <span class="st">'station'</span>:
        <span class="cf">return</span> sell_station_id <span class="op">==</span> buy_station_id
    <span class="co"># Remaining checks require solar system IDs and distance between solar systems</span>
    sell_solar <span class="op">=</span> get_solar_system_id(sell_station_id)
    buy_solar <span class="op">=</span> get_solar_system_id(buy_station_id)
    <span class="co"># Case 3 - "solarsystem"</span>
    <span class="cf">if</span> order_range <span class="op">==</span> <span class="st">'solarsystem'</span>:
        <span class="cf">return</span> sell_solar <span class="op">==</span> buy_solar
    <span class="co"># Case 4 - check jump range between solar systems</span>
    jump_count <span class="op">=</span> compute_jumps(sell_solar, buy_solar)
    <span class="cf">return</span> jump_count <span class="op">&lt;=</span> <span class="bu">int</span>(order_range)</code></pre></div>
<p>There are two functions we need to implement to complete our matcher:</p>
<ul>
<li><code>get_solar_system_id</code> - maps a station ID to a solar system ID.</li>
<li><code>compute_jumps</code> - calculates the minimum number of jumps needed to get from one solar system to another.</li>
</ul>
<p>We can implement most parts of these functions using the SDE. However, if either station is a player-owned structure, then the SDE alone won’t be sufficient. Let’s first assume neither station is player-owned and implement the appropriate functions. For this example, we’ll load our Jupyter notebook with region and station information as in previous examples. We’ll also include type and date information so that we can download an order book snapshot for Tritanium to use for testing:</p>
<figure>
<img src="img/ex3_cell1.PNG" alt="Example Setup" /><figcaption>Example Setup</figcaption>
</figure>
<p>The next cell contains our order matcher, essentially identical to the code above:</p>
<figure>
<img src="img/ex3_cell2.PNG" alt="Buy Order Matching Function" /><figcaption>Buy Order Matching Function</figcaption>
</figure>
<p>Let’s start with the <code>get_solar_system_id</code> function. Since we’re assuming that neither station is a player-owned structure, this function will be just a simple lookup from the SDE:</p>
<figure>
<img src="img/ex3_cell3.PNG" alt="Solar System ID Lookup" /><figcaption>Solar System ID Lookup</figcaption>
</figure>
<p>Implementing the <code>compute_jumps</code> function, however, is a bit more complicated. In order to calculate the minimum number of jumps between a pair of solar systems, we first need to determine which solar systems are adjacent, then we need to compute a minimal path using adjacency relationships. Fortunately, the <code>scipy</code> package provides a library to help solve this straightforward graph theory problem. Our first task is to build an adjacency matrix indicating which solar systems are adjacent (i.e. connected by a jump gate). We start by retrieving all the solar systems in the current region using the SDE:</p>
<figure>
<img src="img/ex3_cell4.PNG" alt="Retrieve All Solar Systems" /><figcaption>Retrieve All Solar Systems</figcaption>
</figure>
<p>The <code>solar_map</code> dictionary will maintain a list of solar system IDs which share a jump gate. The next bit of code populates the dictionary by fetching solar system jump gates from the SDE:</p>
<figure>
<img src="img/ex3_cell5.PNG" alt="Populating Solar System Adjacency" /><figcaption>Populating Solar System Adjacency</figcaption>
</figure>
<p>With adjacency determined, we’re now ready to build an adjacency matrix. An adjacency matrix is a square matrix with dimension equal to the number of solar systems, where the value at location (source, destination) is set to 1 if source and destination share a jump gate, and 0 otherwise. Once we’ve created our adjacency matrix, we use it to initialize a <code>scipy</code> matrix object needed for the next step:</p>
<figure>
<img src="img/ex3_cell6.PNG" alt="Construct Adjacency Matrix" /><figcaption>Construct Adjacency Matrix</figcaption>
</figure>
<p>The last step is to call the appropriate <code>scipy</code> function to build a shortest paths matrix from the adjacency matrix. The result is a matrix where the value at location (source, destination) is the number of solar system jumps required to move from source to destination:</p>
<figure>
<img src="img/ex3_cell7.PNG" alt="Construct Shortest Path Matrix" /><figcaption>Construct Shortest Path Matrix</figcaption>
</figure>
<p>With the shortest path matrix complete, we can now implement the <code>compute_jumps</code> function:</p>
<figure>
<img src="img/ex3_cell8.PNG" alt="Compute Jumps Function" /><figcaption>Compute Jumps Function</figcaption>
</figure>
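<p>To make the adjacency and shortest path steps concrete without the SDE, here is a self-contained sketch. Note that it swaps in a plain breadth-first search from the standard library in place of the <code>scipy</code> routine used in the notebook, and that its <code>compute_jumps</code> takes the precomputed table as an extra argument. The solar system IDs are made up.</p>

```python
from collections import deque


def build_adjacency(gates):
    """Map each solar system ID to the set of systems one jump away."""
    solar_map = {}
    for a, b in gates:
        solar_map.setdefault(a, set()).add(b)
        solar_map.setdefault(b, set()).add(a)
    return solar_map


def shortest_jumps(solar_map):
    """All-pairs jump counts, found by a breadth-first search from every system."""
    jumps = {}
    for source in solar_map:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            current = queue.popleft()
            for neighbor in solar_map[current]:
                if neighbor not in dist:
                    dist[neighbor] = dist[current] + 1
                    queue.append(neighbor)
        jumps[source] = dist
    return jumps


def compute_jumps(jumps, source, destination):
    """Jump count between two systems, or None if no route exists in the map."""
    return jumps.get(source, {}).get(destination)


# Made-up system IDs: a chain 1-2-3-4 plus an isolated pair 5-6.
gates = [(1, 2), (2, 3), (3, 4), (5, 6)]
jump_table = shortest_jumps(build_adjacency(gates))
```

<p>Because every jump has the same cost, breadth-first search is sufficient here; a weighted shortest path algorithm is only needed if jumps are assigned different costs.</p>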
<p>The Jupyter notebook includes a few simple tests to show that this function is working properly. Now that our basic matching algorithm is complete, we can test it on the book snapshot we extracted. In this case, we’ll test which buy orders could potentially match a sell order placed at our target station. This can be done with a simple loop:</p>
<figure>
<img src="img/ex3_cell9.PNG" alt="Finding Matchable Buy Orders" /><figcaption>Finding Matchable Buy Orders</figcaption>
</figure>
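<p>The matching loop can be sketched end to end with toy data. In this version the two lookups are passed in as arguments so the example runs on its own. Treating an unresolved solar system as "not matchable" is our assumption, and all station IDs, system IDs and orders are made up (apart from the structure-style location ID taken from the sample data).</p>

```python
def order_match(sell_station_id, buy_station_id, order_range, system_of, jumps_between):
    """Return True if a sell at sell_station_id can match the given buy order."""
    if order_range == "region":
        return True
    if order_range == "station":
        return sell_station_id == buy_station_id
    sell_solar = system_of(sell_station_id)
    buy_solar = system_of(buy_station_id)
    if sell_solar is None or buy_solar is None:
        return False  # assumption: an unresolved location cannot be matched
    if order_range == "solarsystem":
        return sell_solar == buy_solar
    return jumps_between(sell_solar, buy_solar) <= int(order_range)


# Toy map: stations 100 and 101 share system 1; systems 1-2-3 form a chain.
station_systems = {100: 1, 101: 1, 200: 2, 300: 3}
system_of = station_systems.get


def jumps_between(a, b):
    """Jump count for the toy chain of systems."""
    return abs(a - b)


buy_orders = [
    {"order_id": 1, "location_id": 100, "order_range": "station"},
    {"order_id": 2, "location_id": 101, "order_range": "station"},
    {"order_id": 3, "location_id": 101, "order_range": "solarsystem"},
    {"order_id": 4, "location_id": 200, "order_range": "1"},
    {"order_id": 5, "location_id": 300, "order_range": "1"},
    {"order_id": 6, "location_id": 300, "order_range": "region"},
    {"order_id": 7, "location_id": 1021334931934, "order_range": "5"},
]

sell_station = 100
matchable = [
    o["order_id"]
    for o in buy_orders
    if order_match(sell_station, o["location_id"], o["order_range"], system_of, jumps_between)
]
```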
<p>Although we’ve found several matches, note that there are several orders for which the solar system ID cannot be determined. This is because these orders have been placed at a player-owned structure. Another way to tell this is the case is to look at the location ID for these orders. Location IDs greater than 1,000,000,000,000 (1 trillion) are generally player-owned structures. Let’s now turn our attention to resolving the solar system ID for player-owned structures. The CCP-supported mechanism is to use the <a href="https://esi.tech.ccp.is/latest/#!/Universe/get_universe_structures_structure_id">Universe Structures ESI Endpoint</a>. This endpoint returns location information for a player-owned structure if your authenticated account is authorized to access that structure. If your account is <em>not</em> authorized to access a given structure, then you can’t view location information, <em>even</em> if the buy orders placed from the structure appear in the public market. This is a somewhat inconvenient inconsistency in EVE’s market rules, but fortunately there are third party sites which can be used to discover the location of otherwise inaccessible player-owned structures. We use one such site in this example, primarily because it doesn’t require authentication, and setting up proper authentication to use the supported ESI endpoint is beyond the scope of this example.</p>
<p>The third party site we’ll use in this example is the <a href="https://stop.hammerti.me.uk/api/">Citadel API</a> site, which uses a combination of the ESI and crowd-sourced reporting to track information about player-owned structures. This site provides a very simple API for retrieving structure information based on structure ID. You can create a client for this site using the EveKit libraries:</p>
<figure>
<img src="img/ex3_cell10.PNG" alt="Using the Citadel API to look up structure information" /><figcaption>Using the Citadel API to look up structure information</figcaption>
</figure>
<p>The relevant information for our purposes is <code>systemId</code> which is the solar system ID. With this service, we can implement an improved <code>get_solar_system_id</code>:</p>
<figure>
<img src="img/ex3_cell11.PNG" alt="Improved solar system ID lookup" /><figcaption>Improved solar system ID lookup</figcaption>
</figure>
<p>which fixes any missing solar systems when we attempt to match orders in our snapshot:</p>
<figure>
<img src="img/ex3_cell12.PNG" alt="Proper matches now that all solar systems are resolved" /><figcaption>Proper matches now that all solar systems are resolved</figcaption>
</figure>
<p>And with that, we’ve implemented our buy order matcher.</p>
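<p>The lookup-with-fallback idea can be sketched as follows. This is not the notebook implementation: <code>make_solar_system_lookup</code> and <code>fake_structure_service</code> are hypothetical names, the structure service is replaced by a local function that returns a made-up <code>systemId</code>, and the one trillion cutoff is the rule of thumb described above rather than a guarantee.</p>

```python
PLAYER_STRUCTURE_MIN_ID = 1_000_000_000_000  # rule of thumb: larger IDs are structures


def make_solar_system_lookup(station_systems, structure_lookup):
    """Build a get_solar_system_id function.

    station_systems: dict mapping station ID to solar system ID (SDE stand-in).
    structure_lookup: callable returning a dict with a "systemId" key, or None.
    """
    cache = {}

    def get_solar_system_id(location_id):
        if location_id in station_systems:
            return station_systems[location_id]
        if location_id <= PLAYER_STRUCTURE_MIN_ID:
            return None  # unknown station, and not in the structure ID range
        if location_id not in cache:
            info = structure_lookup(location_id)
            cache[location_id] = info.get("systemId") if info else None
        return cache[location_id]

    return get_solar_system_id


calls = []


def fake_structure_service(structure_id):
    """Stand-in for a remote structure lookup; the returned system ID is made up."""
    calls.append(structure_id)
    return {"systemId": 30000144}


lookup = make_solar_system_lookup({60003760: 30000142}, fake_structure_service)
station_system = lookup(60003760)
structure_system = lookup(1021334931934)
structure_system_again = lookup(1021334931934)  # served from the cache
```

<p>Caching the structure results keeps repeated matches over the same snapshot from calling the remote service once per order.</p>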
<p>As currently implemented, our matcher makes frequent calls to the SDE, which can be inefficient for analyzing large amounts of data. The remainder of the Jupyter notebook for this example describes library support for caching map information for frequent access. We end the example with a convenient library function that implements our buy matcher in its entirety, including resolving solar system information from alternate sources.</p>
<h3 id="example-4---unpublished-data-build-a-trade-heuristic">Example 4 - Unpublished Data: Build a Trade Heuristic</h3>
<p>The CCP-provided EVE market data endpoints provide quote and aggregated trade information. For some trading strategies (e.g. market making), finer grained detail is often required. For example, which trades matched a buy market order versus a sell market order? What time of day do most trades occur? Because CCP does not yet provide individual trade information, we’re left to infer trade activity ourselves. In some cases, we can deduce trades based on changes to existing market orders, as long as those orders are not completely filled (i.e. they still appear in the next order book snapshot). Orders which are removed, however, could either be canceled or completely filled by a trade. As a result, we’re left to use heuristics to infer trading behavior.</p>
<p>In this example, we develop a simple trade inference heuristic. This will be our first taste of the type of analysis we’ll perform many times in later chapters in the book. Specifically, we’ll need to derive a hypothesis to explain some market behavior; we’ll need to do some basic testing to convince ourselves we’re on the right track; then, we’ll need to perform a back test over historical data to confirm the validity of our hypothesis. Of course, performing well in a back test is no guarantee of future results, and back tests themselves can be misused (e.g. over-fitting). A discussion of proper back testing is beyond the scope of this example. We’ll touch on this topic as needed in later chapters (there are also numerous external sources which discuss the topic).</p>
<p>We’ll use a day of order book snapshots for Tritanium in The Forge to test our heuristic. This example dives more deeply into analysis than previous examples. We’ll find that the “obvious” choice for estimating trades does not work very well, and we’ll briefly discuss a hypothesis on how to make a better choice. We’ll show how to perform a basic analysis of our hypothesis, then show a simple back test evaluating our strategy. You can follow along with this example by downloading the <a href="code/book/Example_4_Trade_Heuristic.ipynb">Jupyter Notebook</a>.</p>
<p>Earlier in this chapter we noted that one problem with order book data is that CCP’s endpoint sometimes omits orders leading to gaps. Since trades are inferred from order book snapshots, we first need to deal with the gapping problem. Such issues are not uncommon in the real world of data science. Fortunately, we can fix most of these problems although we don’t have enough information to claim we’ve fixed all such gaps. Once we’ve applied our fix to the data, we continue with our analysis on trade estimation.</p>
<p>The nature of the gapping problem is that, occasionally, orders will disappear from order book snapshots, only to re-appear again in a later snapshot. This can confuse our trade heuristic which attempts to infer trades by looking at differences between subsequent snapshots. We can illustrate this problem by looking for orders with this behavior. In fact, we don’t have to look much further than the first snapshot in this particular example. The following code finds orders which are missing from some snapshots:</p>
<figure>
<img src="img/ex4_cell1.PNG" alt="Finding Missing Orders" /><figcaption>Finding Missing Orders</figcaption>
</figure>
<p>If we take a look at the first order we found (e.g. 4720076544), we can verify this order is missing by checking the issue date:</p>
<figure>
<img src="img/ex4_cell2.PNG" alt="Missing Order Issue Date" /><figcaption>Missing Order Issue Date</figcaption>
</figure>
<p>The issue date is clearly before our target date, so this order should definitely be in the first snapshot. In fact, it turns out this order is missing from many snapshots:</p>
<figure>
<img src="img/ex4_cell3.PNG" alt="Order Gaps" /><figcaption>Order Gaps</figcaption>
</figure>
<p>To fix this problem, we’ve added a “fill_gaps” option to the EveKit order book loader which will backfill gaps when it’s clear an order is missing. When the order loader detects a gap, it works backwards from the snapshot where the gap was detected, inserting the order into any missing snapshot until it finds a snapshot with a timestamp before the issue date of the order, or it finds a snapshot where the order already exists.</p>
<p>If we reload the order book, this time with <code>fill_gaps=True</code>, we see that all gaps have been repaired:</p>
<figure>
<img src="img/ex4_cell4.PNG" alt="Order Gaps Fixed" /><figcaption>Order Gaps Fixed</figcaption>
</figure>
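<p>The backfill rule described above can be sketched in a few lines. This is our own simplified reading of that rule, not the EveKit implementation: snapshots are modeled as a time-ordered list of <code>(time, book)</code> pairs, each book is a dictionary keyed by order ID, and a backfilled order is simply copied from the snapshot where it reappears (so its volume reflects that later snapshot).</p>

```python
def fill_gaps(snapshots):
    """Backfill orders into earlier snapshots from which they are missing.

    snapshots: list of (time, {order_id: order}) pairs sorted by time, where
    each order is a dict with an "issued" time in the same units as time.
    The list is modified in place and returned.
    """
    for i in range(1, len(snapshots)):
        _, book = snapshots[i]
        for order_id, order in list(book.items()):
            j = i - 1
            while j >= 0:
                prev_time, prev_book = snapshots[j]
                # Stop at a snapshot taken before the order was issued, or at
                # one that already lists the order.
                if prev_time < order["issued"] or order_id in prev_book:
                    break
                prev_book[order_id] = dict(order)
                j -= 1
    return snapshots


# Toy data: order 1 vanishes from two snapshots; order 2 is genuinely new at t=600.
order_1 = {"issued": -1000, "volume": 50}
order_2 = {"issued": 500, "volume": 10}
snapshots = [
    (0, {1: dict(order_1)}),
    (300, {}),
    (600, {2: dict(order_2)}),
    (900, {1: dict(order_1), 2: dict(order_2)}),
]
filled = fill_gaps(snapshots)
```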
<p>Henceforth, we’ll use the “fill_gaps” feature any time having gap free data is important. Let’s move now to inferring trades from the order book.</p>
<p>A quick review of EVE market mechanics tells us that once an order is placed, it can only be changed in the following ways:</p>
<ul>
<li>The price can be changed. Changing price also resets the issue date of the order.</li>
<li>The order can be canceled. This removes the order from the order book.</li>
<li>The order can be partially filled. This reduces volume for the order, but otherwise the order remains in the order book.</li>
<li>The order can be completely filled. This removes the order from the order book.</li>
</ul>
<p>Since a partially filled order is the only unambiguous indication of a trade, let’s start by building our heuristic to catch those events. The following function does just that:</p>
<figure>
<img src="img/ex4_cell5.PNG" alt="Initial Trade Heuristic" /><figcaption>Initial Trade Heuristic</figcaption>
</figure>
<p>This function reports the set of inferred trades as a DataFrame:</p>
<figure>
<img src="img/ex4_cell6.PNG" alt="Partial Fill Trades" /><figcaption>Partial Fill Trades</figcaption>
</figure>
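<p>A minimal sketch of partial fill detection between two consecutive snapshots is shown below. The book layout (a dictionary keyed by order ID) and the field names are assumptions made for this example, and the result is a list of dictionaries rather than a DataFrame.</p>

```python
def infer_partial_fills(prev_book, next_book, next_time):
    """Report a trade for every order whose volume dropped between snapshots."""
    trades = []
    for order_id, before in prev_book.items():
        after = next_book.get(order_id)
        if after is not None and after["volume"] < before["volume"]:
            trades.append({
                "time": next_time,
                "order_id": order_id,
                "buy": before["buy"],
                "price": after["price"],  # listed price; the real trade price may differ
                "volume": before["volume"] - after["volume"],
                "location_id": before["location_id"],
            })
    return trades


# Toy books: order 1 is partially filled, order 2 is unchanged, order 3 disappears.
prev_book = {
    1: {"buy": False, "price": 4.75, "volume": 1000, "location_id": 60003760},
    2: {"buy": True, "price": 4.50, "volume": 500, "location_id": 60003760},
    3: {"buy": True, "price": 4.40, "volume": 200, "location_id": 60003760},
}
next_book = {
    1: {"buy": False, "price": 4.75, "volume": 400, "location_id": 60003760},
    2: {"buy": True, "price": 4.50, "volume": 500, "location_id": 60003760},
}
trades = infer_partial_fills(prev_book, next_book, next_time=1483229100000)
```

<p>Order 3 is deliberately ignored here: a removed order is ambiguous, so this first version of the heuristic does not count it.</p>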
<p>Note that the trade price may not be correct as market orders only guarantee a minimum price (in the case of a sell), or a maximum price (in the case of a buy). The actual price of a trade depends on the price of the matching order and could be higher or lower. Note also that we can only be certain of the trade location for sell orders, since these always transact at the location of the seller. For a buy order, the location is only certain if the order happens to list a range of <code>station</code>.</p>
<p>The best way to test our heuristic is to compute trades for a day where market history is also available. We’ve done that in this example so that we can load the relevant market history and compare results. From that comparison, we see that partial fills only account for a fraction of the volume for our target day:</p>
<figure>
<img src="img/ex4_cell7.PNG" alt="Difference Between Trade Heuristic and Historic Data" /><figcaption>Difference Between Trade Heuristic and Historic Data</figcaption>
</figure>
<p>In this example, partial fills account for about 40% of the daily volume for this day. That leaves us to estimate complete fills, but it also tells us this is an important estimate as more than half of this day’s trading volume came from complete fills. There are many ways we can estimate complete fills, but a simple strategy is to start with the naive approach of assuming any order which is removed between book snapshots must be a completed fill. We know this will rarely be correct, but it is possible that the number of removed orders which are actually cancels is small enough to not be significant. Let’s update our trade heuristic to capture these fills in addition to the partial fills we already capture:</p>
<figure>
<img src="img/ex4_cell8.PNG" alt="Naive Capture of Complete Fills" /><figcaption>Naive Capture of Complete Fills</figcaption>
</figure>
<p>How does this version compare?</p>
<figure>
<img src="img/ex4_cell9.PNG" alt="Naive Capture Results" /><figcaption>Naive Capture Results</figcaption>
</figure>
<p>The naive approach overshoots volume by almost 200%. This tells us that in fact a significant amount of removed order volume is due to cancels (or expiry) instead of complete fills. The results also show that we’re only able to recover about 30% of the trades for the day. This is important because the naive algorithm captures all possible trades visible in the data and yet still misses a significant number (by count). This tells us that a substantial number of trades are occurring between snapshots. If these trades are partial fills, then we’re already capturing the volume but we have no mechanism to capture the individual trades. It’s also possible that limit orders are being placed and completely filled between snapshots. We have no way to capture these trades as they are not visible in the data. Given the short duration between snapshots (5 minutes at time of writing), it seems unlikely we’re missing very short lived limit orders.</p>
<p>Since we can’t match trade count, we’ll instead focus on trying to more closely match volume for the day. We know that some removed orders must be cancels and not complete fills. Perhaps this is related to volume. Let’s take a look at a histogram of the volume of the data from inferred trades (i.e. trades which are either complete fills or cancels):</p>
<figure>
<img src="img/ex4_cell10.PNG" alt="Histogram of Inferred Trades" /><figcaption>Histogram of Inferred Trades</figcaption>
</figure>
<p>From this plot, we can see that there are very clear outliers. In fact, it looks like trades with volume above 500 million may actually be cancels instead of trades. Why would we draw this conclusion? Because trades this large would represent a substantial portion of daily volume. In fact, by doing a simple calculation in the next cell, we see that the sum of the volume of trades above 500 million accounts for about 80% of total daily volume. Moreover, <em>removing</em> these trades from our set of inferred trades gives us an inferred volume very close to actual volume.</p>
<p>This analysis suggests a very simple strategy for distinguishing complete fills from cancels:</p>
<ol type="1">
<li>if an inferred trade has volume less than some threshold, it’s a complete fill.</li>
<li>otherwise, the trade is actually a cancel.</li>
</ol>
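<p>As a minimal sketch, the two rules above amount to partitioning removed orders around a threshold. The function name and the dictionary layout of each order (an <code>order_id</code> and a <code>volume</code> key) are assumptions for illustration, not the representation used in the notebook.</p>

```python
def split_fills_and_cancels(removed_orders, threshold):
    """Partition removed orders into (fills, cancels) by volume.

    Orders with volume below the threshold are treated as complete fills;
    all other orders are treated as cancels.
    """
    fills = [o for o in removed_orders if o["volume"] < threshold]
    cancels = [o for o in removed_orders if o["volume"] >= threshold]
    return fills, cancels


# Hypothetical removed orders, for illustration only.
orders = [
    {"order_id": 1, "volume": 100},
    {"order_id": 2, "volume": 600},
    {"order_id": 3, "volume": 500},
]
fills, cancels = split_fills_and_cancels(orders, threshold=500)
print([o["order_id"] for o in fills], [o["order_id"] for o in cancels])
```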
<p>If we adopt this strategy, what value should we use for a threshold? We can’t conclude that 500 million will always be an appropriate threshold. For one thing, each asset type will almost certainly have a different threshold. For another, as volume changes daily, it’s likely the appropriate threshold will change daily as well. A better choice might be to base our threshold on a percentage of daily volume. That way, our threshold will adjust as volume changes over time. If we arbitrarily choose our threshold for this example as a starting point, then our target threshold is approximately 4% of daily volume. We don’t have much data to suggest that 4% of daily volume is the right threshold. But, for the sake of completing this example, let’s assume this is the correct ratio. This is the final version of our trade inference function, which treats orders above a certain volume as cancels:</p>
<figure>
<img src="img/ex4_cell11.PNG" alt="Final Version of Trade Heuristic" /><figcaption>Final Version of Trade Heuristic</figcaption>
</figure>
<p>We’ll now turn to back testing our new strategy. A “back test” is simply an evaluation of an algorithm over some period of historical data. For this example, we’ll test our strategy over the thirty days prior to our original test date. The example <a href="code/book/Example_4_Trade_Heuristic.ipynb">Jupyter Notebook</a> provides cells you can evaluate to download sufficient market data to local storage. We strongly recommend you do this as book data will take significantly longer to fetch on demand over a network connection.</p>
<p>Since we may want to be able to infer trades on a day for which we don’t yet have historic data (e.g. the current day), we’ll set the volume threshold for the current day to be 4% of the average volume for the preceding five days (i.e. a five-day moving average of daily volume). Market making, discussed in a later chapter, is an example strategy where it’s important to be able to infer trades before historic data is tabulated for the day. Now that we have all required data, our back test is then a simple iteration over the appropriate date range:</p>
<figure>
<img src="img/ex4_cell12.PNG" alt="Back Test Loop" /><figcaption>Back Test Loop</figcaption>
</figure>
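<p>The threshold rule used by the back test (4% of the five-day moving average of daily volume) can be sketched on its own as follows. The function name and the list-based input are assumptions for illustration; the notebook derives the same quantity from market history held in a DataFrame.</p>

```python
def volume_threshold(prior_daily_volumes, ratio=0.04, window=5):
    """Return ratio * (mean volume of the last `window` prior days).

    prior_daily_volumes must be ordered oldest to newest and must not
    include the day being analyzed.
    """
    recent = prior_daily_volumes[-window:]
    if not recent:
        raise ValueError("need at least one prior day of volume")
    return ratio * sum(recent) / len(recent)


# Made-up daily volumes for illustration only.
history = [100, 200, 300, 400, 500, 600]
threshold = volume_threshold(history)
print(threshold)
```

<p>Only the last <code>window</code> entries are used, so the oldest value in <code>history</code> above has no effect on the result. If fewer than <code>window</code> days are available, this sketch simply averages whatever is there.</p>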
<p>We capture the results in a DataFrame for further analysis:</p>
<figure>
<img src="img/ex4_cell13.PNG" alt="Back Test Results" /><figcaption>Back Test Results</figcaption>
</figure>
<p>We can then view the results of our test comparing inferred trade count and volume with historic values on the same days. The following graphs show the results of this comparison (values near zero are better):</p>
<p><img src="img/ex4_cell14.PNG" alt="Inferred Count vs. Historic Count as Percentage" /> <img src="img/ex4_cell15.PNG" alt="Inferred Volume vs. Historic Volume as Percentage" /></p>
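<p>One plausible way to compute the quantity plotted above is a signed percentage difference between the inferred and historic values. This is an assumption about the metric, not a copy of the notebook’s code, and the function name is hypothetical.</p>

```python
def percent_difference(inferred, actual):
    """Signed difference of inferred from actual, as a percentage of actual.

    Positive values mean the heuristic overshoots, negative values mean it
    undershoots, and zero means an exact match.
    """
    if actual == 0:
        raise ValueError("actual value must be non-zero")
    return 100.0 * (inferred - actual) / actual


# Illustrative values only.
print(percent_difference(120, 100))
print(percent_difference(30, 100))
```

<p>Under this definition, an inferred trade count that recovers only 30% of actual trades shows up as -70, and an inferred volume of three times actual shows up as 200.</p>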
<p>We know that we are unlikely to capture trade count accurately and the first plot confirms this. However, the volume plot is surprisingly good, with many days within 20% of actual, which is likely good enough for our use. If computing trades before history is available is less important, then another strategy would be to sort inferred trades descending by volume and iteratively remove large trades until within some set threshold of actual historic volume. We leave that variant as an exercise for the reader.</p>
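<p>As a starting point for that exercise, here is one possible sketch. It assumes “within some set threshold” means the inferred total is no more than <code>tolerance</code> above actual volume; the function name, the tolerance default and the sample data are all assumptions.</p>

```python
def trim_to_actual(inferred_volumes, actual_volume, tolerance=0.2):
    """Drop the largest inferred trades until total volume is close to actual.

    Trades are removed in descending order of volume until the remaining
    total is at most actual_volume * (1 + tolerance).
    Returns (kept, dropped), both sorted by descending volume.
    """
    ordered = sorted(inferred_volumes, reverse=True)
    limit = actual_volume * (1 + tolerance)
    total = sum(ordered)
    dropped = 0
    while dropped < len(ordered) and total > limit:
        total -= ordered[dropped]
        dropped += 1
    return ordered[dropped:], ordered[:dropped]


# Made-up volumes for illustration only.
kept, dropped = trim_to_actual([100, 250, 50, 4000, 600], actual_volume=900)
print(kept, dropped)
```

<p>Note that this variant can undershoot actual volume if removing a single large trade drops the total well below the target. A fuller solution might stop at whichever step lands closest to actual.</p>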
<p>The EveKit libraries do not include any explicit support for trade analysis such as we described above. The highly heuristic nature of this analysis makes it difficult to provide a standard offering. As we’ll see in later chapters, the basic analysis above can be adapted to the specific needs of a particular trading strategy.</p>
<h3 id="example-5---important-concepts-build-a-liquidity-filter">Example 5 - Important Concepts: Build a Liquidity Filter</h3>
<p>Thousands of asset types are traded in EVE’s markets. However, as in real-world markets, the frequency of trades and the price range for an asset varies widely according to type. Trading strategies usually prefer <em>liquid</em> asset types, which are those asset types that can be bought or sold at relatively stable prices. Such asset types are more amenable to analysis and are typically easier to buy and sell as needed in the market. Real-world traders use liquidity as one of the prime measures for admitting or excluding asset types from their tradable portfolio. There are many factors that lead to price stability, but often the most important factor is daily volume. Assets which are traded daily at reasonable volume are more likely to have rich pools of buyers and sellers, and prices are more likely to converge to a stable range. Additional criteria, such as having roughly balanced buyers and sellers, may also be important.</p>
<p>In this example, we derive a simple liquidity filter. This is one of the simplest of our preliminary examples, but also one of the most important as a well chosen asset portfolio is key to many strategies. We focus our efforts on the general framework for testing liquidity. This framework is pluggable, allowing different filters to be inserted as needed. We create two example filters to illustrate how to use the framework. You can follow along with this example by downloading the <a href="code/book/Example_5_Liquidity_Filter.ipynb">Jupyter Notebook</a>.</p>
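<p>To make the idea of a pluggable framework concrete before walking through the notebook, here is a minimal sketch. It is not the notebook’s implementation: the function names, the representation of market history as a dictionary of per-day records, and the example filter’s parameters are all assumptions for illustration.</p>

```python
def liquid_types(history_by_type, liquidity_filter):
    """Return the sorted type IDs whose market history passes the filter.

    history_by_type maps a type ID to a list of daily records (dicts).
    liquidity_filter is any callable taking that list and returning a bool.
    """
    return sorted(
        type_id
        for type_id, days in history_by_type.items()
        if liquidity_filter(days)
    )


def min_daily_volume_filter(min_volume, min_days):
    """Build a filter requiring enough days, each with enough volume."""
    def passes(days):
        enough_days = len(days) >= min_days
        return enough_days and all(d["volume"] >= min_volume for d in days)
    return passes


# Hypothetical history for three type IDs, for illustration only.
history = {
    34: [{"volume": 1000}, {"volume": 1200}],
    35: [{"volume": 5}, {"volume": 2000}],
    36: [{"volume": 5000}],
}
print(liquid_types(history, min_daily_volume_filter(min_volume=500, min_days=2)))
```

<p>Because a filter is just a callable, swapping in a different liquidity rule means passing a different function to <code>liquid_types</code> without touching the loop over asset types.</p>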
<p>For this example, we’ll look for liquid types in The Forge region and we’ll choose to measure liquidity over the 90 days of market snapshots leading up to our reference date. The choice of date range depends on your trading strategy: if your strategy expects recently liquid assets to remain liquid well into the future, then you should select a longer historical date range; if, instead, you only require assets to remain liquid for the next day or so, then you can select a much shorter historical range. Likewise, we could also create a liquidity filter based on order book data instead of market snapshots. This is usually not necessary unless your strategy calls for analyzing liquidity at certain times of day (some market making strategies may find this information useful). The first cell in our notebook sets up our initial parameters:</p>
<figure>
<img src="img/ex5_cell1.PNG" alt="Liquidity Filter Setup" /><figcaption>Liquidity Filter Setup</figcaption>
</figure>
<p>If you plan to test and iterate over different liquidity filters, you’ll almost certainly want to download market history for the range in question. The next notebook cell does just that. Make sure you are on a reasonably fast network connection before you execute this cell:</p>
<figure>
<img src="img/ex5_cell2.PNG" alt="Liquidity History Download" /><figcaption>Liquidity History Download</figcaption>