​A selection of topics on IT and its application to finance. 
​Send me your comments, questions ​or suggestions by clicking
h​ere


elmama
A real-life example of retrieving XML data from the web and storing it as regular relational-type row and column data in an Oracle table.

 

This example shows many useful techniques including  XML data processing in Oracle, downloading XML data from the web and storing XML data as regular columns/rows a’ la relational style. I was tasked with, on a daily basis, storing a series of XML data files residing on Markit Partner’s web-site into out Oracle database. The data was ISDA interest rate swap curve data and it was to be stored in a regular database table in ordinary row/column relational data  formati.e not as XML. 


There were 10 data files to download in total and they all have the same format. Each of the files followed the same naming convention.


e.g    InterestRates_XXX_YYYYMMDD.xml

Where XXX was the 3 letter currency mnemonic and YYYYMMDD was the year/month/day.


Below is an example of what the file for the Canadian dollar looks like. There are similar data files for EUR, NZD, GBP, JPY, SGD, CHF, HKD, AUD, and USD currencies. All the data files followed the same naming convention


Example of type of data to be processed/stored


<?xml version="1.0" standalone="yes" ?>

-<interestRateCurve>

 <effectiveasof>2014-02-07</effectiveasof>

 <currency>CAD</currency>

 <baddayconvention>M</baddayconvention>

-<deposits>

 <daycountconvention>ACT/365</daycountconvention>

 <snaptime>2014-02-06T21:00:00.000Z</snaptime>

 <spotdate>2014-02-07</spotdate>

-<calendars>

 <calendar>none</calendar>

 </calendars>

-<curvepoint>

 <tenor>1M</tenor>

 <maturitydate>2014-03-07</maturitydate>

 <parrate>0.0122</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>2M</tenor>

 <maturitydate>2014-04-07</maturitydate>

 <parrate>0.01246</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>3M</tenor>

 <maturitydate>2014-05-07</maturitydate>

 <parrate>0.01266</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>6M</tenor>

 <maturitydate>2014-08-07</maturitydate>

 <parrate>0.01339</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>1Y</tenor>

 <maturitydate>2015-02-09</maturitydate>

 <parrate>0.01444</parrate>

 </curvepoint>

 </deposits>

-<swaps>

 <fixeddaycountconvention>ACT/365</fixeddaycountconvention>

 <floatingdaycountconvention>ACT/365</floatingdaycountconvention>

 <fixedpaymentfrequency>6M</fixedpaymentfrequency>

 <floatingpaymentfrequency>3M</floatingpaymentfrequency>

 <snaptime>2014-02-06T21:00:00.000Z</snaptime>

 <spotdate>2014-02-07</spotdate>

-<calendars>

 <calendar>none</calendar>

 </calendars>

-<curvepoint>

 <tenor>2Y</tenor>

 <maturitydate>2016-02-07</maturitydate>

 <parrate>0.01291</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>3Y</tenor>

 <maturitydate>2017-02-07</maturitydate>

 <parrate>0.01495</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>4Y</tenor>

 <maturitydate>2018-02-07</maturitydate>

 <parrate>0.0175</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>5Y</tenor>

 <maturitydate>2019-02-07</maturitydate>

 <parrate>0.01991</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>6Y</tenor>

 <maturitydate>2020-02-07</maturitydate>

 <parrate>0.022055</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>7Y</tenor>

 <maturitydate>2021-02-07</maturitydate>

 <parrate>0.02406</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>8Y</tenor>

 <maturitydate>2022-02-07</maturitydate>

 <parrate>0.02587</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>9Y</tenor>

 <maturitydate>2023-02-07</maturitydate>

 <parrate>0.02744</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>10Y</tenor>

 <maturitydate>2024-02-07</maturitydate>

 <parrate>0.028845</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>15Y</span>tenor>

 <maturitydate>2029-02-07</maturitydate>

 <parrate>0.03347</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>20Y</tenor>

 <maturitydate>2034-02-07</maturitydate>

 <parrate>0.034965</parrate>

 </curvepoint>

-<curvepoint>

 <tenor>30Y</tenor>

 <maturitydate>2044-02-07</maturitydate>

 <parrate>0.03467</parrate>

 </curvepoint>

 </swaps>

 </interestRateCurve>

 

The data we required to store is defined by the following Oracle table definition

desc  isda_curves

Name                                               Null?          Type

-------------------------------- --------    ----------------------------

 CURVENAME                                    NOT NULL  VARCHAR2(20)

 CURVEDATE                                     NOT NULL  DATE

 CURVECURRENCY                             NOT NULL  VARCHAR2(8)

 TYPE                                               NOT NULL  VARCHAR2(8)

 IDENT                                             NOT NULL  VARCHAR2(20)

 MID                                                                 NUMBER

 MATURITY                                                        DATE

 SPOTDATE                                                        DATE

 FIXED_BASIS                                                      VARCHAR2(8)

 FLOAT_BASIS                                                     VARCHAR2(8)

 FIXED_FREQUENCY                                             VARCHAR2(4)

 FLOAT_FREQUENCY                                            VARCHAR2(4)

 

The first thing I needed to do was to create a staging table where I could temporarily store the raw XML data I got from the web. This was straightforward

 

e.g

create table isda_stage (xml_dataxmltype)

 

Next I had to get the 10 data files from Markit’s web-site. We use CURL for this which is kind of like a cross between FTP and HTTP protocols.

 

NB You will need a valid username and password before you can download this data from Markit’s site

 

e.g

 

$ curl --verbose --insecure  -

--output isda_usd.zip --form user=your_username_here -

--form password=your_password_here --form format=zip -

https://www.markit.com/news/InterestRates_USD_20140207.zip

 

The above curl command was repeated for each of the 10 currencies. Once we had unzipped our files the next step was to get the raw xml data into our staging table. This was just a series of 10 inserts like the one shown below.

 

insert into isda_stage VALUES

(

SYS.XMLType

  (

    bfilename('ISDADIR', 'InterestRates_USD_20140207.xml'),nls_charset_id('AL32UTF8')

  )

)

 

Note the use of the bfilename keyword which allows us to load data into our staging table directly from a file which is placed in a named Oracle directory. Our final step is to retrieve the data from the staging table, converting it to regular relational data before inserting it into our final destination table isda_curves described above.


This was the tricky bit for me but eventually after much trawling through the documentation and the web I came across this solution so here’s how I did that bit.

 

insert into isda_curves

(

 curvename,

 curvecurrency,

 curvedate,

 mid,

 ident,

 maturity,

 spot_date,

 type,

 fixed_basis,

 float_basis,

 fixed_frequency,

 float_frequency

)

SELECT distinct

   'isda_curve'                     curvename,

   y.currency                        curvecurrency,

   y.thedte                           curvedate,

   100*x.mid                        mid,

   x.period                            ident,

   x.maturity                        maturity,

   y.spot_date                    spot_date,

   'CA'                                     type,

   y.fixed_daycount         fixed_basis,

   y.float_daycount          float_basis,

   'A'                                        fixed_frequency,

   'A'                                        float_frequency

FROM

   isda_stagei,

   xmltable

   (

   'interestRateCurve'

   PASSING

   i.xmldata

   COLUMNS        currency varchar2(3) PATH 'currency',

                                thedte date PATH 'effectiveasof' ,

                                spot_date date PATH 'deposits/spotdate',

                                fixed_daycountvarchar2(10) PATH 'deposits/daycountconvention',

                                float_daycountvarchar2(10) PATH 'deposits/daycountconvention',

                                float_freqvarchar2(10) PATH 'deposits/fixedpaymentfrequency' ,

                                fixed_freqvarchar2(10) PATH 'deposits/floatingpaymentfrequency'

   ) y,

   XMLTABLE

   ('

      interestRateCurve/deposits/curvepoint'

   PASSING

      i.xmldata

   COLUMNS        mid number PATH 'parrate',

                                period varchar2(2) PATH 'tenor',

                                maturity date PATH 'maturitydate'

   )  x

union all

SELECT distinct

   'isda_curve'     curvename,

   a.currency        curvecurrency,

   a.thedte           curvedate,

   100*b.mid        mid,

   b.period            ident,

   b.maturity        maturity,

   a.spot_date    spot_date,

   'SW'                    type,

   a.fixed_daycountfixed_basis,

   a.float_daycountfloat_basis,

   decode(a.fixed_freq,'3M','Q','6M','S','A') fixed_frequency,

   decode(a.float_freq,'3M','Q','6M','S','A') float_frequency

FROM

   isda_stagei,

   xmltable

   (

   'interestRateCurve'

   PASSING

   i.xmldata

   COLUMNS        currency varchar2(3) PATH 'currency',

                                thedte date PATH 'effectiveasof' ,

                                spot_date date PATH 'swaps/spotdate',

                                fixed_daycountvarchar2(10) PATH 'swaps/fixeddaycountconvention',

                                float_daycountvarchar2(10) PATH 'swaps/floatingdaycountconvention',

                                float_freqvarchar2(10) PATH 'swaps/fixedpaymentfrequency' ,

                                fixed_freqvarchar2(10) PATH 'swaps/floatingpaymentfrequency'

   ) a,

   XMLTABLE

   ('

      interestRateCurve/swaps/curvepoint'

   PASSING

      i.xmldata

   COLUMNS        mid number PATH 'parrate',

                                period varchar2(2) PATH 'tenor',

                                maturity date PATH 'maturitydate'

   )  b