Parsing a website using 1C

Alexander Biryukov

17.02.2021 14 min

Parsing a site using 1C. Obtaining exchange rates from the website of the Central Bank of Georgia.

Although the global pandemic has been going on for a year now, business continues to operate, including international trade. If our company works with partners in other countries, we will almost certainly have to keep accounts of goods or products in different currencies, unless, of course, everyone involved is in the eurozone.

The good news is that 1C solutions work well with multicurrency accounting.

So let's say our main business is in Turkey and one of our suppliers is in Georgia. And we need to know the rate of the Georgian currency relative to the Turkish one.

What's our first step? That's right, we go to the website of the Central Bank of Georgia and find the section "Currency rates" there: https://www.nbg.gov.ge/index.php?m=582
1
As you can see, all the necessary information on exchange rates is on this page. But parsing this page directly would be tricky, since it contains a lot of extraneous markup that complicates the analysis.

Fortunately, the developers of the site have provided an easier way to get the exchange rates, at this link: http://www.nbg.ge/rss.php

Besides, the request accepts a date parameter specifying the day for which we need the rate, which also simplifies our further work. The screenshot below shows the query result with the date parameter specified:

2.png

If you now carefully look at the code that is returned by this link, you will notice that the data on exchange rates are located in an HTML table. Each rate value has several attributes.

These attributes are as follows: the international currency code, the currency description, and the rate value itself. Then comes an HTML image element, which can be either green or red. It shows in which direction the rate of a given currency is moving: up or down. The last attribute is the change in the rate compared with the previous day.

Also, in our example, the description of currencies is given in Georgian. Let's leave this as it is for now - and we'll come back to that later.

So, as you can see, this HTML page structure is quite simple, and we have every chance of loading its data into 1C successfully.

Well - let's start!

Let's open Designer and create a new external data processor in it, as well as a new form:

3.png
To begin with, let's create the "GetRates" command and a procedure for processing this command on the form:

4_1.png
Recall that we have to send a request to the site and receive some data in response. Naturally, in 1C we use the HTTPConnection object and its Get method. Thus, our procedure looks something like this:


&AtClient
Procedure GetRates(Command)

    Connection = New HTTPConnection("www.nbg.gov.ge");
    request = New HTTPRequest("rss.php?date=2021-02-02");
    HTTPAnswer = Connection.Get(request);

    If HTTPAnswer.StatusCode = 200 Then

        stringResult = HTTPAnswer.GetBodyAsString();
        If Not TypeOf(stringResult) = Type("String") Then
            Return;
        EndIf;

    EndIf;

EndProcedure
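The request above assumes that the server is reachable and answers quickly. A minimal sketch of a more defensive variant might look like the following. The function name DownloadRatesText and the 10-second timeout are assumptions made for illustration; they are not part of the original data processor.

```bsl
&AtClient
Function DownloadRatesText(resourceAddress)   // hypothetical helper name

    // Assumption: a 10-second timeout is acceptable. The sixth constructor
    // parameter of HTTPConnection is the timeout in seconds.
    Connection = New HTTPConnection("www.nbg.gov.ge", , , , , 10);
    request = New HTTPRequest(resourceAddress);

    Try
        HTTPAnswer = Connection.Get(request);
    Except
        // Network-level failures (DNS error, timeout, etc.) raise an exception
        Message(BriefErrorDescription(ErrorInfo()));
        Return Undefined;
    EndTry;

    If HTTPAnswer.StatusCode <> 200 Then
        Message("Unexpected HTTP status: " + HTTPAnswer.StatusCode);
        Return Undefined;
    EndIf;

    Return HTTPAnswer.GetBodyAsString();

EndFunction
```

A caller could then write stringResult = DownloadRatesText("rss.php?date=2021-02-02"); and stop processing when the result is Undefined.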



And let's check out how this code works right away. We set a breakpoint and start 1C in dialog mode:

5.png

As you can see, we received data in 1C, and now this data needs to be processed. Of course, you can write your own parsing algorithm for this, but 1C has special classes (objects) for performing such tasks.

If you look closely at the previous screenshot, you will probably notice that the site returned us data in XML format. So maybe we should use the XMLReader object? This object is excellent for working with XML data.

But unfortunately, in this case, we cannot use the XMLReader object. The reason for this is straightforward. The fact is that our main data - exchange rates - is stored in an HTML table, and using XMLReader, we cannot read this data.

Since our data is stored in the form of an HTML table, we need to read it using the HTMLReader object. Let's try to do this.

As a reminder, the HTMLReader is always used in conjunction with the DOMBuilder. The HTMLReader object reads HTML data, and the DOMBuilder object builds the document structure from this data.

The source code for our procedure, in this case, looks like this:

&AtClient
Procedure GetRates(Command)

    Connection = New HTTPConnection("www.nbg.gov.ge");
    request = New HTTPRequest("rss.php?date=2021-02-02");
    HTTPAnswer = Connection.Get(request);

    If HTTPAnswer.StatusCode = 200 Then

        stringResult = HTTPAnswer.GetBodyAsString();
        If Not TypeOf(stringResult) = Type("String") Then
            Return;
        EndIf;

        // Read the response body as HTML and build a DOM document from it
        HTMLReader = New HTMLReader;
        HTMLReader.SetString(stringResult);

        DOMBuilder = New DOMBuilder;
        documentHTML = DOMBuilder.Read(HTMLReader);

    EndIf;

EndProcedure



Let's test this code in action and see what it returns. We are interested in the structure of the documentHTML object:

6.png
We put a breakpoint, run our code and see what data got into documentHTML:

7.png
By sequentially going through the document's child nodes, we can get to the data of interest to us. But we can simplify this a little by using the "GetElementByTagName" method of the documentHTML object. Let's take another look at the XML data structure. As you can see, the data we are interested in is in the "description" node:

8.png
True, there are two nodes with this name, but in any case, finding the desired node is greatly simplified. So, we add this line to our procedure:

listNodes = documentHTML.GetElementByTagName("description");

This allows us to select only the nodes with a specific tag name. Let's immediately check our code interactively:

9.png
As you can see from the screenshot, we get only two nodes, and we are interested in the node with index 1, which most likely contains the data we need.

How do I know? It's very simple! Notice the FirstChild property of this node. Its type is HTMLTableRowElement. That is, this node contains the rows of an HTML table.

We expand the ChildNodes of this node, and voila - this is an HTML table and our data. In this case, the final path to the required data looks like this:

10.png
Well, it's time to get back to writing the code. But before that, let's place a table on the form in which the received data will be displayed:

11.png

and add the following code to the procedure, respectively:


If TypeOf(listNodes) = Type("DOMElementList") Then
    For Each node In listNodes Do
        If TypeOf(node.FirstChild) = Type("HTMLTableRowElement") Then
            For Each childNode In node.ChildNodes Do
                If TypeOf(childNode) = Type("HTMLTableRowElement") Then

                    newRate = ExchangeRates.Add();

                    newRate.Currency = childNode.Cells[0].TextContent;
                    newRate.CurrencyDescription = childNode.Cells[1].TextContent;
                    newRate.Rate = childNode.Cells[2].TextContent;
                    newRate.Variation = childNode.Cells[4].TextContent;

                EndIf;
            EndDo;
        EndIf;
    EndDo;
EndIf;
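TextContent always returns a string. If the Rate and Variation columns of the table are numeric (an assumption; the article does not state the column types), the text can be converted explicitly. The sketch below uses a hypothetical helper named TextToNumber and assumes that the feed uses a period as the decimal separator.

```bsl
&AtClient
Function TextToNumber(cellText)   // hypothetical helper name

    // Assumption: "." is the decimal separator in the feed; a "," is
    // normalized defensively and surrounding whitespace is trimmed.
    cleanText = StrReplace(TrimAll(cellText), ",", ".");

    Try
        Return Number(cleanText);
    Except
        // Empty or non-numeric cell: fall back to zero
        Return 0;
    EndTry;

EndFunction
```

With this helper, the assignment could be written as newRate.Rate = TextToNumber(childNode.Cells[2].TextContent);. Returning zero on failure is a design choice for this sketch; reporting the bad value to the user may be more appropriate in a real solution.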




And of course, let's immediately check how it works:

12.png
As you can see, everything works. But there are a couple of drawbacks.

First, we see exchange rates for only one date, and it would be nice to let the user choose the date of the exchange rate.

Second, the currency descriptions are in Georgian. Of course, the Georgian alphabet is very ancient, but not everyone can read it. Therefore, we need to get the currency descriptions in English.

All this is very simple to do: we add the necessary parameters to our GET request, something like this: "?date=2020-10-30&lng=eng".

Let's change the line: request = New HTTPRequest("rss.php?date=2021-02-02");

to this: request = New HTTPRequest(GetTextForQuery());

and add two functions, GetTextForQuery() and GetDateForQuery():

&AtClient
Function GetTextForQuery()

    stringResult = "rss.php?date=" + GetDateForQuery();

    If English Then
        stringResult = stringResult + "&lng=eng";
    EndIf;

    Return stringResult;

EndFunction

&AtClient
Function GetDateForQuery()

    Return Format(RatesDate,"DF = ""yyyy-MM-dd""");

EndFunction
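If the user leaves the date empty, Format returns an empty string and the request is sent with an empty date parameter. One possible variant of GetDateForQuery, sketched here under the assumption that an empty date should mean "today", fills the attribute first. It also uses the plain "DF=yyyy-MM-dd" form of the format string.

```bsl
&AtClient
Function GetDateForQuery()

    // Assumption: an empty RatesDate should default to the current date
    If Not ValueIsFilled(RatesDate) Then
        RatesDate = CurrentDate();
    EndIf;

    // Produces a string such as 2021-02-02
    Return Format(RatesDate, "DF=yyyy-MM-dd");

EndFunction
```

Because RatesDate is a form attribute, the defaulted value also appears in the date field on the form, so the user can see which date was actually requested.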




We will also add two new attributes English and RatesDate to the form. For the convenience of displaying, we will bring these attributes and the GetRates command into one group:

13.png

And don't forget to add the line  ExchangeRates.Clear(); at the beginning of the procedure.

Well, let's check it out! And surprisingly, everything works: both the date selection and the currency descriptions in English!

14.png
We could say that we are done, but there is still one more parameter that we have not displayed. It shows in which direction the rate has changed compared with the previous day: whether it rose or fell.

Again, we set a breakpoint and look for the node in which the attribute we need is located:
 
15.png

After that, we see that the path "childNode.Cells[3].ChildNodes[0].Src" contains a link to one of two files: "red.gif" or "green.gif". There is no point in downloading these files from the server, so we will simply process the link text.

Let's add the following code to our procedure:

If Not StrFind(childNode.Cells[3].ChildNodes[0].Src,"green") = 0 Then
    newRate.UpDown = 1;
ElsIf Not StrFind(childNode.Cells[3].ChildNodes[0].Src,"red") = 0 Then
    newRate.UpDown = 2;
EndIf;
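The code above relies on every row having an image in the fourth cell. As a hedged sketch, the same check can be written so that a row without an image (an assumption; the feed may never produce one) simply gets UpDown = 0 instead of raising an error. The variable imageSource is introduced here for illustration only.

```bsl
// Assumption: some rows may lack the image, in which case the
// path below cannot be resolved and an exception is raised.
Try
    imageSource = childNode.Cells[3].ChildNodes[0].Src;
Except
    imageSource = "";
EndTry;

If StrFind(imageSource, "green") > 0 Then
    newRate.UpDown = 1;   // green.gif
ElsIf StrFind(imageSource, "red") > 0 Then
    newRate.UpDown = 2;   // red.gif
Else
    newRate.UpDown = 0;   // no image or an unrecognized file name
EndIf;
```

Reading the Src value once into a variable also avoids walking the same node path twice.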



and check how this code is executed:

16.png
As you can see, everything works correctly. But instead of the numbers 1 and 2, I would like to see a visual indication of whether the rate moved up or down. Let's try to add this feature.

First of all, change the element type "ExchangeRatesUpDown" to "Image field":

17.png
Then we need to fill in the "ValuesPicture" attribute. To do this, we need to select a graphic file that contains the necessary images:

18.png

The file format is very simple: the file itself is a PNG, and each icon must be 16 by 16 pixels. All icons are arranged in a single row, and each icon is accessed by its index, which starts at zero. In our example, the green icon is at index 1 and the red icon at index 2.

After making these changes, we save our code and, of course, run it interactively. As you can see, everything works, and now we can visually see which currency has fallen and which has grown:

19.png

Well, it's time to take stock. We have shown how you can get information from the site you are interested in and upload this data directly to your solution on the 1C platform. In our example, we received data on exchange rates from the Bank of Georgia website and can now use this data to create accounting applications.

And by the way - Georgia is a beautiful country. I highly recommend going there at least once.

You can download the Demo Data Processor to study this 1C mechanism in detail. If you have any questions about this article, you can always get answers on our forum: https://1c-dn.com/forum/
Stay tuned - there is still a lot of new things to come!
