Channel: VBForums - Visual Basic .NET

VS 2012 VB.net extract links from google-search using HtmlAgilityPack

My code runs without errors, but it does not display the complete links: some of them are shortened, with part of the path replaced by dots ("...").

Here is my working code:

Code:

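Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click

    ' Download the raw HTML of the search results page.
    ' Note: everything after "#" in the URL is a fragment and is not sent to the server.
    Dim WebSource As String
    Using webClient As New System.Net.WebClient()
        WebSource = webClient.DownloadString("http://www.google.com.ph/search?hl=en&as_q=test&as_epq=&as_oq=&as_eq=&as_nlo=&as_nhi=&lr=&cr=countryCA&as_qdr=all&as_sitesearch=&as_occt=any&safe=images&tbs=ctr%3AcountryCA&as_filetype=&as_rights=#as_qdr=all&cr=countryCA&fp=1e63a873f2e9c884&hl=en&lr=&q=test&start=20&tbs=ctr:countryCA")
    End Using
    RichTextBox1.Text = WebSource

    ' Parse the page with HtmlAgilityPack.
    Dim htmlDoc As New HtmlAgilityPack.HtmlDocument()
    htmlDoc.LoadHtml(WebSource)

    ' SelectNodes returns Nothing when no <cite> element matches.
    Dim citeNodes As HtmlAgilityPack.HtmlNodeCollection = htmlDoc.DocumentNode.SelectNodes("//cite")
    If citeNodes Is Nothing Then Return

    For Each link As HtmlAgilityPack.HtmlNode In citeNodes
        ' Keep only results whose displayed URL contains the search term.
        If link.InnerText.Contains("test") Then
            ListBox1.Items.Add(link.InnerText)
        End If
    Next

End Sub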

And here is the output displayed in the ListBox:

Code:

www.wherecreativitygoestoschool.com/vancouver/left.../rb_test.htm
www.icbc.com/driver-licensing/getting-licensed/.../skills-test
www.drivetest.ca/
www.drivetest.ca/EN/bookatest/Pages/Road-Test-Booking.aspx
www.drivetest.ca/EN/drivereducation/Pages/Driver-Testing.aspx
www.cic.gc.ca/english/citizenship/cit-test.asp

The 1st and 2nd links are my problem. They are incomplete: part of each path is replaced by dots. How do I display the complete link? Do I have to change something in my code? Any help or advice will be gladly accepted, thanks in advance.

Please bear with me, I am still learning the basics and am willing to learn.
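
One possible direction, offered only as a sketch: the text inside a cite element is the shortened URL shown on the page, so the dots are already part of it. The full address usually sits in the href attribute of the result's title link instead. The example below assumes that each result title is an a element inside an h3 with class "r", and that the href has the redirect form /url?q=<target>&... ; both are assumptions about Google's markup, which changes over time and should be checked against the HTML actually shown in RichTextBox1. ExtractResultLinks and the sample markup are hypothetical, made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports HtmlAgilityPack

Module FullLinkSketch

    ' Hypothetical helper: returns the target URL of each result title link.
    Function ExtractResultLinks(html As String) As List(Of String)
        ' Assumption: redirect links look like /url?q=<target>&sa=...
        Const prefix As String = "/url?q="

        Dim results As New List(Of String)()
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)

        ' Assumption: each result title is an <a> inside <h3 class="r">.
        Dim anchors As HtmlNodeCollection =
            doc.DocumentNode.SelectNodes("//h3[@class='r']/a[@href]")
        If anchors Is Nothing Then Return results ' nothing matched

        For Each anchor As HtmlNode In anchors
            ' The attribute value may still be HTML-encoded (&amp;), so decode it.
            Dim href As String = HtmlEntity.DeEntitize(anchor.GetAttributeValue("href", ""))

            If href.StartsWith(prefix, StringComparison.Ordinal) Then
                ' Keep only the q= value and undo its percent-encoding.
                href = href.Substring(prefix.Length)
                Dim amp As Integer = href.IndexOf("&"c)
                If amp >= 0 Then href = href.Substring(0, amp)
                href = Uri.UnescapeDataString(href)
            End If

            results.Add(href)
        Next

        Return results
    End Function

    Sub Main()
        ' Made-up sample markup standing in for the downloaded page.
        Dim sample As String =
            "<h3 class=""r""><a href=""/url?q=http://www.example.com/a/b/test.htm&amp;sa=U"">Test</a></h3>" &
            "<cite>www.example.com/a/.../test.htm</cite>"

        For Each url As String In ExtractResultLinks(sample)
            Console.WriteLine(url)
        Next
    End Sub

End Module
```

In the button handler, the same idea would mean passing WebSource to the helper and adding each returned string to ListBox1 in place of link.InnerText. If the XPath matches nothing on the real page, the selector needs adjusting to whatever markup Google returns for that request.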
