Retrieve web page that involves login

Posted: Wed Apr 14, 2010 1:50 am
by lohill
There is a web page I would like to get some data from. The site requires a password, but I have set up my Safari browser so that I do not need to supply one when I go there. On the VMware Fusion/Windows side of my machine I have the same arrangement in Firefox. On the Windows side I also have a number of Excel macros that can get data from that web page.

When I try to access that page from REV, however, I get the generic response that a person without a login would get. I am definitely not an expert on HTTP and web access, but I did find a discussion in this forum that got me started. As a result I can find the cookies that get generated in my REV process, but I don't know what to do with them once I have them.

I have a very simple stack for testing: one field called 'LogField' and one button called 'Launch'. The script for the button is the following:

Code:

on mouseUp
   put empty into field "LogField"
   libURLsetLogField "LogField"
   put "http://www.investors.com/StockResearch/Quote.aspx?symbol=MSFT" into myURL
   put url myURL into myRetrieve
   put libURLLastRHHeaders() into tRHheaders
   put "" into tCookies
   repeat with i = 1 to number of lines of tRHheaders
      If offset("Set-Cookie:", line i of tRHheaders) = 1 then
         put line i of tRHheaders & return after tCookies
      end if
   end repeat
   if tCookies > "" then
      --set the httpHeaders to tCookies
      --put url myURL into myRetrieve1
      --put offset("Group RS Rating",myRetrieve1) into tSpot
   else
      answer "Cookies not found."
   end if
end mouseUp
It is at the commented-out part where I need help. In fact, if I let the commented part execute, I get errors on all subsequent tries. I have to quit REV, and then when I come back into the stack I have to comment out that section again to see the cookies. I have tried this on both the Mac side and the Windows side with similar results.
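
A minimal sketch of how the commented-out section could be filled in. It assumes the server wants the cookies sent back in a single "Cookie:" request header that holds only the name=value part of each "Set-Cookie:" line, joined with semicolons. The handler name cookieHeaderFromResponse and the variable names are made up for illustration. The httpHeaders property stays in effect until it is changed, so leaving raw "Set-Cookie:" lines in it may be what breaks later requests; the sketch clears it after use. This only replays cookies from an anonymous visit and does not by itself log in.

```livecode
-- Hypothetical helper: build one "Cookie:" request header from the
-- "Set-Cookie:" lines found in a block of response headers.
function cookieHeaderFromResponse pResponseHeaders
   local tPairs, tLine, tRest, tPair
   set the itemDelimiter to ";"
   repeat for each line tLine in pResponseHeaders
      if tLine begins with "Set-Cookie:" then
         -- "Set-Cookie:" is 11 characters; keep what follows it
         put char 12 to -1 of tLine into tRest
         -- only name=value (before the first ";") is sent back;
         -- word 1 to -1 trims the surrounding spaces
         put word 1 to -1 of item 1 of tRest into tPair
         if tPair is not empty then
            if tPairs is not empty then put "; " after tPairs
            put tPair after tPairs
         end if
      end if
   end repeat
   if tPairs is empty then return empty
   return "Cookie: " & tPairs
end cookieHeaderFromResponse

on mouseUp
   local tURL, tFirstTry, tSecondTry, tCookieHeader
   put "http://www.investors.com/StockResearch/Quote.aspx?symbol=MSFT" into tURL
   put url tURL into tFirstTry
   put cookieHeaderFromResponse(libURLLastRHHeaders()) into tCookieHeader
   if tCookieHeader is empty then
      answer "Cookies not found."
      exit mouseUp
   end if
   set the httpHeaders to tCookieHeader
   put url tURL into tSecondTry
   -- reset so the extra header is not sent with every later request
   set the httpHeaders to empty
end mouseUp
```

If the second request still returns the generic page, the cookies handed out to an anonymous visitor are probably not the ones that identify a logged-in session.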

Any help you can give me will be appreciated.

Thanks in advance,
Larry

Re: Retrieve web page that involves login

Posted: Wed Apr 14, 2010 6:40 pm
by lohill
Further study has led me to delimit the list of cookies that I retrieve with semicolons rather than returns. I am still not sure what to do with them. I also read something about '<iframe>' and 'src', and I can capture them from myRetrieve, but again, I don't know what to do with them.

Any help, examples, or references would be appreciated.
Thanks,
Larry

Re: Retrieve web page that involves login

Posted: Thu Apr 15, 2010 6:30 pm
by lohill
I have been able to extract all the '<iframe' data from myRetrieve and there are four of them:
<iframe src='/SiteAds/TopInHouseAd.aspx?page=/StockResearch/Quote.aspx&identifier=' marginwidth='0' marginheight='0' frameborder='0' scrolling='no'>
<iframe src='/SiteAds/InHouseAd.aspx?page=/StockResearch/Quote.aspx&identifier=' marginwidth='0' marginheight='0' frameborder='0' scrolling='no'>
<iframe src='/SiteAds/TradingCenterVertical.aspx?page=/StockResearch/Quote.aspx&identifier=' marginwidth='0' marginheight='0' frameborder='0' scrolling='no'>
<iframe src='/SiteAds/BottomAd.aspx?page=/StockResearch/Quote.aspx&identifier=' marginwidth='0' marginheight='0' frameborder='0' scrolling='no'>
It is my understanding that one of them is the redirect to where I want to go, provided I pass it along with the proper cookies. Is this correct? My guess is that it would be the third one. If that is the case, I need to know exactly what gets passed. Would it be everything after the single quote up to, but not including, the '>'? And then my final question (again): how does it get passed? Is it appended to the original URL and then sent by 'put URL myURL ...', or do I have to write to a socket, which I guess could be captured from the log information?
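
On the question of what exactly gets passed, here is a minimal sketch. The value of src is only the text between the two single quotes. Because it starts with "/", it is a path on the same site, so it would be requested as its own URL rather than appended to the original one. The sketch assumes single-quoted src attributes as in the four tags above, and iframeSrcToURL is a hypothetical name. The /SiteAds/ paths suggest these frames may be embedded ad panels rather than a redirect, so fetching them may not lead to the data.

```livecode
-- Hypothetical helper: pull the src value out of one <iframe ...> tag and
-- turn a site-relative path into a full URL.
function iframeSrcToURL pTag, pSiteRoot
   local tStart, tRest, tPath
   put offset("src='", pTag) into tStart
   if tStart = 0 then return empty
   -- everything after src=' up to, but not including, the closing quote
   put char (tStart + 5) to -1 of pTag into tRest
   set the itemDelimiter to "'"
   put item 1 of tRest into tPath
   if tPath begins with "/" then return pSiteRoot & tPath
   return tPath
end iframeSrcToURL

on mouseUp
   local tTag, tFrameURL, tFrameHTML
   put "<iframe src='/SiteAds/TradingCenterVertical.aspx?page=/StockResearch/Quote.aspx&identifier=' marginwidth='0'>" into tTag
   put iframeSrcToURL(tTag, "http://www.investors.com") into tFrameURL
   -- an ordinary GET; any "Cookie:" header in the httpHeaders goes with it
   put url tFrameURL into tFrameHTML
end mouseUp
```

No socket code should be needed for this: 'put url' sends a normal HTTP request, and whatever is in the httpHeaders is sent along with it.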

As you can see, I need the advice of someone with some experience at this.

Larry