Tutorial #2. Parser.
This tutorial describes how to create a standalone network script that parses Google search results.
This tutorial contains a lot of steps. If you would rather see them in action, you can skip the text and watch the video.
After finishing this tutorial, you will know how to use functions, loops and variables in BrowserAutomationStudio. You will also know how to parse page content.
To start creating the script, run Browser Automation Studio and hit the record button.
Hit the load button to make Browser Automation Studio load the Google page.
Input google.com as the load url and hit ok.
Create a resource to let the user input the search query.
Restart the script and set the query to "cats".
Use the type action.
Select the resource.
Start the script. The Google site will be loaded and the query that the user has entered will be typed.
The search is performed with the Enter key, so the script needs to press Enter in Browser Automation Studio.
Use the type action and the special key <RETURN> to do that.
Now let's parse the text of the first link. To do that, move the mouse pointer over the element and click the left mouse button.
Select the "Get Element Text" menu entry.
After you hit the ok button, the text of the first link will be saved to the SAVED_TEXT variable. You can choose another variable name.
Use the log action to output the variable.
Use the @ button to select a variable.
Select the SAVED_TEXT variable.
The input field value will be [[SAVED_TEXT]]. This means that the value of SAVED_TEXT will be logged. Resources are framed with {{}} and variables with [[]].
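To make the difference between the two kinds of brackets easier to see, here is a minimal Python sketch of how such a text could be expanded. It is only an illustration and not how Browser Automation Studio works internally. The function name expand_template and the sample value "Cat - Wikipedia" are made up for this example.

```python
import re

# Matches either {{resource_name}} or [[VARIABLE_NAME]].
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}|\[\[(\w+)\]\]")


def expand_template(template, resources, variables):
    """Replace {{name}} with a resource value and [[NAME]] with a variable value."""

    def substitute(match):
        resource_name, variable_name = match.groups()
        if resource_name is not None:
            # Resources are framed with double curly braces.
            return str(resources[resource_name])
        # Variables are framed with double square brackets.
        return str(variables[variable_name])

    return PLACEHOLDER.sub(substitute, template)


# Hypothetical values: "query" is the resource set by the user,
# SAVED_TEXT is a made-up example of what the script could have saved.
resources = {"query": "cats"}
variables = {"SAVED_TEXT": "Cat - Wikipedia"}

message = expand_template("Query {{query}} returned: [[SAVED_TEXT]]", resources, variables)
print(message)
```

The text around the placeholders is kept as is, which is why custom text can be mixed with variables in the same input field.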
You can also add custom text to the input field.
You should see the following result in the log panel after the action executes.
Now let's parse the url. Select the "Get Attribute" menu entry.
Set the attribute name to "href". After you hit ok, the result will be saved to SAVED_ATTRIBUTE.
Log SAVED_ATTRIBUTE in the same way as you did with SAVED_TEXT.
The script processes only the first Google search result. If you need to parse every result, use the "For Each Element" menu.
First, select all four parse actions and press "Del" to remove them.
Select the "Start Loop" action.
To start the loop, you need a query that enumerates all the elements to be parsed. The current query is :nth-child(1) > .rc > .r > a. Remove the :nth-child(1) > part, because it selects only the first element and we need all of them.
Set the query to .rc > .r > a and continue.
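The query change is a plain text edit: only the leading part that limits the match to the first element is removed. A small Python sketch of that edit, with a hypothetical helper name drop_first_child_filter, could look like this:

```python
def drop_first_child_filter(selector):
    """Remove a leading ':nth-child(1) > ' so the selector matches every result."""
    prefix = ":nth-child(1) > "
    if selector.startswith(prefix):
        return selector[len(prefix):]
    # Nothing to remove, the selector already matches all elements.
    return selector


single_result_query = ":nth-child(1) > .rc > .r > a"
all_results_query = drop_first_child_filter(single_result_query)
print(all_results_query)  # .rc > .r > a
```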
Move the cursor inside the for loop. Actions that you place inside the loop will be executed on every iteration.
Use "Get Element Text" from the "For Each Element" menu to parse the element text. The "For Each Element" menu has the same entries as the main menu, but it is intended to be used only inside a loop created with the "Start Loop" action.
It has the same interface as the original "Get Text" action.
Log the result, then save the element url to the SAVED_ATTRIBUTE variable and log it, the same way as you did with the first element. When you are done, the action panel should look like the screenshot.
Restart the script and your log will contain all links from the page!
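For readers who think in code, the loop built above does roughly the following: visit every link, read its text, read its "href" attribute, and log both. The Python sketch below shows the same idea with the standard html.parser module. It is a simplification under stated assumptions: SAMPLE_PAGE is a made-up page, the class name LinkCollector is hypothetical, and the code collects every <a> element with an href instead of evaluating the .rc > .r > a query.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects a (text, href) pair for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the link being read, None outside a link
        self._text = []     # text pieces of the link being read

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


# Made-up page content, shaped like the elements the query selects.
SAMPLE_PAGE = """
<div class="rc"><div class="r"><a href="https://example.com/one">First result</a></div></div>
<div class="rc"><div class="r"><a href="https://example.com/two">Second result</a></div></div>
"""

collector = LinkCollector()
collector.feed(SAMPLE_PAGE)

# One iteration per element: log the text, then log the url.
for saved_text, saved_attribute in collector.links:
    print(saved_text)
    print(saved_attribute)
```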
Right now the action panel holds a lot of items. Let's create a function "ParseGooglePage" and move all the code that does the parsing there. Click the + button at the bottom of the action panel.
Type "ParseGooglePage" in the popup.
The new function is created! It is empty by default, but we will soon put content there.
Go back to the main function using the combobox.
Select the code that does the parsing and hit Ctrl-X. This shortcut will cut the "For" action and all its descendants.
Switch back to "ParseGooglePage" and paste the actions with the Ctrl-V shortcut.
If you run the code as is, the page won't be parsed, because there is no function call. Let's fix that by selecting the "Call Function" action inside "Main".
Select the function and hit ok.
The action panel looks more compact now.
So far the script parses only the first page. In the next steps we will upgrade it to parse a custom number of pages. To achieve this, we need to add a For action.
The For action has two parameters: from and to. Both are integers, and both can be loaded from a resource. For now, let's set them to 1 and 3. This means that the loop will be executed 3 times and will parse the first 3 Google pages.
Copy the "ParseGooglePage" function call inside the loop.
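The structure of the script at this point can be sketched in Python as a loop that calls a function on every iteration. This is only an analogy: parse_google_page is a hypothetical stand-in for the "ParseGooglePage" function, and moving between pages is left out. Note that both bounds are counted, so from 1 to 3 gives three iterations.

```python
def parse_google_page(page_number, log):
    """Hypothetical stand-in for the ParseGooglePage function."""
    log.append(f"parsed page {page_number}")


def run(first_page, last_page):
    """Call the parse function once for every page from first_page to last_page."""
    log = []
    # Both bounds are inclusive, so 1 and 3 give three iterations.
    for page_number in range(first_page, last_page + 1):
        parse_google_page(page_number, log)
    return log


log = run(1, 3)
print(log)
```

Because the bounds are ordinary parameters of run, they could just as well come from user input, which mirrors loading from and to from a resource.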
The last step is to click on the next page link. It can be done with the "Move And Click On Element" action.
That's it! The script now parses the first 3 Google pages.