
Commercial Client FAQ

General

What encoding is used for CSI’s data files?

We currently use Windows-1252 encoding for all data files.

Can a dividend and a split occur on the same ex-date?

Yes. It should be noted that when adjusting history for splits and dividends, the split adjustment is applied first, and then the dividend adjustments.

Is the dividend currency always US dollars, even for ADRs?

Yes.

I saw a settlement price that was not in the high-low range.

It is perfectly acceptable to have a settlement outside of the high-low range. The close field is usually the settlement price, not the last trade price (except for those CSI numbers designated as having the last as close), and there is no rule that the settlement must be in the high-low range. This rarely happens with active months but is not uncommon for lightly traded months.

Is the csinum a unique identifier and will it ever be reused?

CSI will almost never reassign the number assigned to an issue. For example, you will not see CSI number 5159 used for anything other than IBM, or for a company related to IBM in the case of a takeover or spin-off. The only exception to this rule is when an issue starts trading so low (below 0.01 or in some cases 0.001) that we are forced to remove it from our system. Sometimes these penny stocks reverse split and change symbols, and we do not pick up on the fact that it was a previously covered security. When this happens, you will have received a D (deletion) record for the issue on the day it was removed from our system, and you will receive either an A (add) record if it is re-added to our system under a new number or an M (modification) record if it is recognized as a previously covered issue and reassigned to the same number as before.

Factsheet

Can we get an asset type (common stock, preferred stock, ADR, etc.) with the fact sheet data?

For specific exchange and asset types, see the stock csv factsheet document, which is kept live. There are fields for common stock, preferred stock, depository receipt, etc. It is found here: Stock factsheet (csv). The web version is here: Stock factsheet (html).

There are also factsheets for all commodities, options, forex, etc. here:

Factsheets for all CSI data

The historic US stocks do not include exchange flags for the NASDAQ exchange. Will we be getting data for that exchange?

The Nasdaq stocks are mixed in with the “OTC” exchange.

Daily Data Files

What are the fields in the daily files?

View the csv daily format specifications.

There seems to be a pattern in the ordering of the daily update files: file header (type 00), fact sheet modifications (type 11), error corrections (type 09), records with pricing, splits, dividends, capital gains (Types 01,32,33,34,35,36,37,08), then file trailer (type 00). Will this pattern remain consistent and in what order should they be processed?

The order will remain consistent. The 00 will always be the first and last records. This should be used to verify the whole file is there. The file should be processed from start to end. Any 11 factsheet records should be processed first. Then all 09 corrections and 13 deletions. Then all the normal price data records.
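The checks and ordering described above can be sketched in a few lines of Python. This is a minimal illustration, not CSI code: the function name `load_daily_records` and the `PRIORITY` table are hypothetical, and the sketch assumes that the record type is the first CSV column (as in the sample lines in this FAQ), so that Field# 3 of the 00 record (the record count) is the fourth column.

```python
import csv
import io

# Assumed priority buckets, following the order described above:
# factsheet records (11) first, then corrections (09) and deletions (13),
# then all normal data records.
PRIORITY = {"11": 0, "09": 1, "13": 1}


def load_daily_records(text):
    """Parse a daily file, verify it is complete, and return the body
    records in the recommended processing order."""
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    # The 00 record must be both the first and the last record.
    if len(rows) < 2 or rows[0][0] != "00" or rows[-1][0] != "00":
        raise ValueError("file must start and end with a type 00 record")
    # Field# 3 of the header is the record count, including header and trailer.
    expected = int(rows[0][3])
    if expected != len(rows):
        raise ValueError(f"expected {expected} records, found {len(rows)}")
    body = rows[1:-1]
    # sorted() is stable, so records keep their file order within each bucket.
    return sorted(body, key=lambda row: PRIORITY.get(row[0], 2))
```

Because the file already arrives in this order, the sort is normally a no-op; it is kept only as a safeguard so that the processing order does not depend on the file layout.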

What are correction records and how are they handled?

Corrections (type 09) are sent whenever the normal data records for a previous day were incorrect. The reason they were initially incorrect could have been an exchange error or a human error by us. The corrections should overwrite and replace all fields of the records in your local history. Corrections are available for stock/future/option price records, splits, dividends, and capital gains. Corrections for futures and options include current day volume and open interest.

What are deletion records and how are they handled?

When you see a type 13 “delete past day of data” record, you should delete the entire day of data from your table. This means that the original record was sent in error, and no trading actually happened on that day. For futures, the delivery month is given. If that field is 0000, then you should delete any V/OI header records, as well as any 54 cash records.

In the CSI Comma-Separated Values Format document type 00: Will the value for Field# 1 be consistent for each type of data? (i.e., will the index daily updates always have “G54” for this field?)

Yes. Below are the pre-set stock identifiers:

****     Type 1 - New York                     G01        ****
****     Type 2 - Amex                         G02        ****
****     Type 3 - Nasdaq                       G03        ****
****     Type 4 - Indices                      G04        ****
****     Type 5 - Canadian                     G05        ****
****     Type 6 - London                       G06        ****
****     Type 7 - Mutuals                      G07        ****
****     Type 8 - US Stocks (no funds)         GL9        ****
****     Type 9 - All except London            G99        **** 
****     Type 10- Foreign Indices              G08        ****
****                                                      ****
****   Decimal Types -                                    ****
****     Type 11- New York                     G51        ****
****     Type 12- Amex                         G52        ****
****     Type 13- Nasdaq                       G53        ****
****     Type 14- Indices                      G54        ****
****     Type 15- Canadian                     G55        ****
****     Type 16- London                       G56        ****
****     Type 17- Mutuals                      G57        ****
****     Type 18- Foreign Indices              G58        ****
****     Type 19- All (except London)          G59        **** 
****     Type 20- US Stocks (no funds)         G60        **** 

Field# 2 of record type 00 says that ‘1’ represents a daily file type. Will we get other file types? If so, what are the other file types?

There are currently no other file types that use 00 records.

Is field# 4 the default date for the records in the file?

Yes.

In the CSI Comma-Separated Values Format document, type 11: What does it mean that Field# 9 is “Reserved”? I have not seen any values for it.

It is currently unused and can be ignored.

Why are Fields 10 – 13 “Pending”? Will there eventually be values for those fields?

We currently have no plans to populate those fields.

In the CSI Comma-Separated Values Format document, type 36, field# 4 is described as the “asking price (NAV plus commission).” Are there examples where this value is different from the NAV? Is this field included in the mutual fund historic data?

This is an old feature, and we no longer provide NAV plus commission values in either the daily or the history files.

For type 09 corrections: suppose a correction updates price data for an earlier date (type 33), and the security has since gone through a dividend or split event. Will the corrected price data be adjusted? Will there be any indication that it should be adjusted?

Correction prices are always unadjusted.

How do you handle a dividend or split that was reported on the wrong date?

We send a 1:1 split correction to delete incorrect splits. We send a 0 dividend correction to delete an incorrect dividend or capital gain.
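On the receiving side, these "delete by correction" conventions might be handled as in the sketch below. The storage layout is an assumption made for illustration only: splits and dividends are kept in plain dictionaries keyed by (CSI number, ex-date), and the function names are hypothetical rather than part of any CSI format.

```python
def apply_split_correction(splits, csi_number, ex_date, new_shares, old_shares):
    """Store a corrected split, or drop the stored split when the
    correction is 1:1 (meaning no split occurred on that date)."""
    key = (csi_number, ex_date)
    if new_shares == old_shares:
        splits.pop(key, None)
    else:
        splits[key] = (new_shares, old_shares)


def apply_dividend_correction(dividends, csi_number, ex_date, amount):
    """Store a corrected dividend or capital gain, or drop the stored
    value when the corrected amount is 0."""
    key = (csi_number, ex_date)
    if amount == 0:
        dividends.pop(key, None)
    else:
        dividends[key] = amount
```

Removing the entry, rather than storing a 1:1 split or a 0 dividend, keeps later adjustment logic from iterating over events that have no effect.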

For type 11 modifications, what kinds of changes should I expect and check for?

Name changes, exchange changes, conversion factor changes.

On the day a type 11 record is sent for a given symbol, can a type 33 record be posted for it as well?

Yes.

For type 11 stock delistings, are the ‘D’ records always sent one day after the last update date?

Not always. We try to provide them one day after. But sometimes we don’t send them until a later date when the information is provided to us or our checking procedure detects them.

What should I assume the open interest date + volume date to be for correction records of inner type 32? Presumably I should take these dates from the 01 record when they are available, but in the case where the 01 record is missing or doesn’t have the dates I suppose I should assume that the open interest date + volume date are just equal to the correction date?

That is correct. Commodity corrections always use the correction date for the volume and open interest date.

I am getting a correction with a date prior to the start of my history.

You will receive all corrections generated by our system regardless of whether your history goes back that far. If it is before the start of your database you can just disregard the records.

Record types 07 and 08 (dividends and splits) come with an ex-dividend date. Under what circumstances can this be different from the date of the daily update that comes in the header?

Only when receiving record types 07 and 08 via a correction record. If it is not a correction record, dividends and splits are always for the date of the daily update.

I am getting a lot of 09 records (error correction) – what is the average number of errors on a daily basis?

Unfortunately lots of stocks continue to accumulate volume up to 7PM, after the CSI stock posting. If volume differs significantly during this time, CSI will issue a correction, so the correction count can be a few thousand per day. Again, the vast majority of these are for volume adjustments.

When a commodity 32 correction record is sent for the previous business day, the volume or open interest will sometimes be 0 when it shouldn’t be.

When the correction record is for the previous business day, the volume/oi may or may not be filled in, depending on whether the volume and oi for the prior day have been entered into the CSI database at the time the correction is made by CSI personnel. Some issues do not have v/oi filled in for the prior business day until late in the current business day. You should follow the advice to apply the correction records first and then the normal data records. That ensures that, regardless of whether the v/oi was available at the time of correction, the later overwrite by the regular data record will be correct.

How do I handle factsheet records?

It’s recommended that you keep a factsheet table to reference the symbols, names, exchanges, and status. When you see an A or M record, you can insert or replace all given fields in your factsheet table, along with a status column set to ‘A’ (active). If you see a D (delete) record, you can simply change the status to ‘D’. It is very common for stocks to change symbols, names, and exchanges. Since you are getting decimal values, you can ignore any “conversion factor” fields that you come across.
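A factsheet table of this kind can be sketched with an in-memory dictionary keyed by CSI number. The function name, the field names, and the sample values below are hypothetical; a real implementation would map the actual type 11 fields to database columns. The sketch updates only the fields supplied in the record, which is an assumption about how partial records should be treated.

```python
def apply_factsheet_record(factsheet, action, csi_number, fields=None):
    """Apply one A (add), M (modification) or D (deletion) factsheet
    record to a local table keyed by CSI number."""
    if action in ("A", "M"):
        row = factsheet.setdefault(csi_number, {})
        row.update(fields or {})   # insert or replace all given fields
        row["status"] = "A"        # the issue is active
    elif action == "D":
        # Keep the row so history can still be resolved; only flag it.
        if csi_number in factsheet:
            factsheet[csi_number]["status"] = "D"
    else:
        raise ValueError(f"unknown factsheet action: {action!r}")
```

Marking a deleted issue instead of removing its row means that stored history for that CSI number can still be looked up by name or symbol later.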

The header contains today’s date and a default date. Is the data in the file associated with today’s date or the default date?
00,LMS,1,20100,20160121,4,20160120,20160120

In the 00 type record, the first date is the date for all OHLC prices in the file. The second and third dates are the default dates of the volume and open interest, respectively, for each futures market in the file. However, if the 01 type record for a given future contains volume and OI date fields, then those dates will override the default dates for that future.

What date does the volume and OI apply to for futures and options?

The volume and open interest fields have the official values for the previous trading day (T-1). The total estimated volume always applies to the current day (T), which is the date of the daily file. Therefore the estimated total volume is available one day before the official volumes are. The dates for the volume and OI are specified in the type 01 headers. If those fields are blank, then the dates are specified in the type 00 header.

Type 00. The file header and trailer records.
Field# 1 Portfolio identifier (alphanumeric).
Field# 2 File type flag. ‘1’ for daily.
Field# 3 Record count, including header and trailer records.
Field# 4 Today’s date in CCYYMMDD format.
Field# 5 Day number of week. 1=Monday, 2=Tuesday…7=Sunday.
Field# 6 Default volume date in CCYYMMDD format for all futures and options.
Field# 7 Default open interest date in CCYYMMDD format for all futures and options.

Type 01. Commodity/commodity option/stock option header record.
Field# 1 Symbol.
Field# 2 CSI commodity number.
Field# 3 Option type flag. 0=non-option, 2=put, 3=call.
Field# 4 Total volume for the previous trading day.
Field# 5 Total open interest for the previous trading day.
Field# 6 Total estimated volume for the current day.
Field# 7 Volume date, if different from the default volume date in Type 00.
Field# 8 Open interest date, if different from the default open interest date in Type 00.

Type 32. Commodity future contract record.
Field# 1 Symbol.
Field# 2 CSI commodity number.
Field# 3 Delivery year and delivery month, in YYMM format.
Field# 4 Open price.
Field# 5 Unused.
Field# 6 High price.*
Field# 7 Low price.*
Field# 8 Close price. It can be settlement or last depending on the CSI number.
Field# 9 Previous day settle price.
Field# 10 Volume for the previous trading day, for the date in associated Type 01 (or Type 00 if the Type 01 date is blank).
Field# 11 Open interest for the previous trading day, for the date in associated Type 01 (or Type 00 if the Type 01 date is blank).

For example, in this 20201215 daily file, the volume and OI dates for #2 LC are for 20201214.

00,EXP,1,20158,20201215,2,20201214,20201214
01,LC,2,0,41055,284374,30230
32,LC,2,2012,108.8,,109.325,108.4,108.6,109.1,1016,7529
32,LC,2,2102,112.925,,113.35,112.6,112.875,113.1,16305,113775
32,LC,2,2104,117,,117.6,116.825,117.125,117.375,10041,72370
32,LC,2,2106,112.3,,112.9,112.2,112.7,112.6,8521,61494
32,LC,2,2108,111.8,,112.325,111.7,112.15,112.025,3399,15589
32,LC,2,2110,115,,115.35,114.625,115.15,115.05,1389,9206
32,LC,2,2112,117,,117.75,117,117.525,117.5,244,2992
32,LC,2,2202,119.2,,119.425,119.05,119.25,119.25,80,1034
32,LC,2,2204,119.55,,119.85,119.55,119.85,119.6,60,385
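The date resolution shown in this example can be sketched as follows. The function names are hypothetical, and the column positions are inferred from the field lists and sample lines above, with the record type in the first column followed by Field# 1, Field# 2, and so on. Matching each 32 record to its 01 header by CSI number is a choice made for this sketch.

```python
import csv
import io


def _to_int(value):
    """Convert a CSV field to int, treating a blank field as missing."""
    return int(value) if value else None


def futures_with_dates(text):
    """Return one dict per type 32 record, with the dates that its
    volume and open interest apply to."""
    defaults = (None, None)   # (volume date, OI date) from the type 00 header
    overrides = {}            # CSI number -> (volume date, OI date) from type 01
    result = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        kind = row[0]
        if kind == "00" and len(row) > 7:
            defaults = (row[6], row[7])
        elif kind == "01":
            # Field# 7 and Field# 8 are optional; blank or absent means
            # "use the default date from the type 00 record".
            vol_date = row[7] if len(row) > 7 and row[7] else defaults[0]
            oi_date = row[8] if len(row) > 8 and row[8] else defaults[1]
            overrides[row[2]] = (vol_date, oi_date)
        elif kind == "32":
            vol_date, oi_date = overrides.get(row[2], defaults)
            result.append({
                "symbol": row[1],
                "delivery": row[3],
                "volume": _to_int(row[10]),
                "volume_date": vol_date,
                "open_interest": _to_int(row[11]),
                "oi_date": oi_date,
            })
    return result
```

Applied to the sample file above, every LC contract is assigned 20201214 for both dates, because the 01 record carries no date fields of its own.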

For futures, which records should be read?

The 00 and 01 records should be read to get the volume and open interest dates. The 32 records contain the price data. The 09 and 13 records should be processed and applied to your stored history.

For options, which records should be read?

The 00 and 01 records should be read to get the volume and open interest dates. The 34 records contain the price data. The 09 and 13 records should be processed and applied to your stored history.

What are the differences between the old and new daily file formats?
  • New records are included:
    • #11 commodity factsheet corrections for additions/deletions.
    • #16 footnote corrections.
    • Options price corrections.
  • Strikes are displayed as decimal instead of points format.

How could we track ticker changes on our side for historical data? The daily file includes #11 factsheet changes. How should we use such information? Do we change the tickers of the historical data? I’m concerned that this may cause some issues.

To solve this issue, we use an internal code called the CSI number. It is an integer. We store the metadata for stocks in our stock “factsheet”. It contains the CSI numbers along with the current names, tickers, and exchanges. When a listing changes tickers, names, or exchanges, we send a #11 record in the daily file which provides the new values. You should maintain a local copy of the metadata and update the values based on the #11 records. When referencing stocks from our daily and history files, you should use the CSI number as the identifier and not the ticker. When needed, the tickers can be looked up in the metadata.

Stock factsheet (html format with search)

Stock factsheet (csv format)

Historical Data Files

In the historic data, what is the purpose of the two columns of symbols? Why do the daily updates for the fact sheets only include one (current) symbol?

The first symbol is the symbol of the filename for each file, and the second symbol is the actual symbol that CSI uses. Some adjustments had to be made for filename structure.

The Mutual Fund historic symbols file has “FUND” listed as the exchange for all of the mutual funds, but the daily updates from 11/20/15 and 11/23/15 have “MUTUAL” as the exchange code. What is the difference?

They are equivalent.

Should I store the data unadjusted in my database? How do I adjust the history for splits and dividends?

We recommend getting and storing the stock history unadjusted, because all price and volume corrections will be sent unadjusted. You should also store all splits and dividends. When you wish to see adjusted data, you need to do the adjusting in your business logic. You should iterate backwards in time, and starting on the day prior to each split ex-date you should multiply the price by the split adjustment (oldshares/newshares) and the volume by the inverse (newshares/oldshares). You should store an adjustment factor as you go back in time, and apply each split that you see. Also, if desired, you can adjust for dividends by subtracting the dividends, starting on the day before each dividend ex-date. You should store a dividend total as you go back in time, to be subtracted. For each day, the split adjustment is always applied first, then the dividend adjustment.
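The backward pass described above might look like the following sketch. The function name and the data layout (tuples of date, close, and volume, plus dictionaries of splits and dividends keyed by ex-date) are hypothetical. One detail is an assumption of this sketch rather than something stated above: stored dividend amounts are taken to be unadjusted, so each dividend is scaled by the split factor already accumulated from later splits before it is added to the running total.

```python
def back_adjust(bars, splits=None, dividends=None):
    """Return split- and dividend-adjusted copies of unadjusted daily bars.

    bars:      list of (date, close, volume), dates as CCYYMMDD strings
    splits:    {ex_date: (new_shares, old_shares)}
    dividends: {ex_date: unadjusted amount per share}
    """
    splits = splits or {}
    dividends = dividends or {}
    events = sorted(set(splits) | set(dividends), reverse=True)
    factor = 1.0      # cumulative split adjustment (oldshares/newshares)
    div_total = 0.0   # cumulative dividends to subtract
    next_event = 0
    adjusted = []
    for date, close, volume in sorted(bars, reverse=True):
        # Pick up every event whose ex-date is after this bar, so the
        # adjustment starts on the day prior to the ex-date.
        while next_event < len(events) and events[next_event] > date:
            ex_date = events[next_event]
            if ex_date in dividends:
                # Assumption: scale the dividend into adjusted terms.
                div_total += dividends[ex_date] * factor
            if ex_date in splits:
                new_shares, old_shares = splits[ex_date]
                factor *= old_shares / new_shares
            next_event += 1
        # Split adjustment first, then the dividend adjustment.
        adjusted.append((date, close * factor - div_total, volume / factor))
    adjusted.reverse()
    return adjusted
```

Only the close is adjusted here to keep the sketch short; the open, high, and low would be treated the same way. Since the stored history stays unadjusted, the function can simply be rerun whenever a price, split, or dividend correction arrives.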