How can I find a table after a text string using BeautifulSoup in Python?

I am trying to extract data from several web pages which are not uniform in how they display their tables. I need to write code that will search for a text string and then go to the table...

How can I turn <br> and <p> into line breaks?

Let's say I have an HTML document with <p> and <br> tags inside. Afterwards, I'm going to strip the HTML to clean up the tags. How can I turn them into line breaks? I'm using Python's BeautifulSoup...

Converting html to text with Python

I am trying to convert an HTML block to text using Python. Input: <div class="body"><p><strong></strong></p> <p><strong></strong>Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean...

How to retrieve the values of dynamic html content using Python

I'm using Python 3 and I'm trying to retrieve data from a website. However, this data is dynamically loaded and the code I have right now doesn't work: url = eveCentralBaseURL +...

Python: Need to wait before BeautifulSoup and Urllib can parse a website

I am trying to get the current world population in real time, but when the webpage first loads it takes a couple of seconds to retrieve the data. When I run the program I get "loading..." instead of...

Parse tables from a PDF document

The PDF in this link (http://www.lenovo.com/psref/pdf/psref450.pdf) contains a number of tables like this: I'd like to programmatically extract the data and the structure from these...

Web scraping without knowledge of page structure

I'm trying to teach myself a concept by writing a script. Basically, I'm trying to write a Python script that, given a few keywords, will crawl web pages until it finds the data I need. For...

bs4.FeatureNotFound: Couldn't find a tree builder with the features you requested: lxml. Do you need to install a parser library?

... soup = BeautifulSoup(html, "lxml") File "/Library/Python/2.7/site-packages/bs4/__init__.py", line 152, in __init__ % ",".join(features)) bs4.FeatureNotFound: Couldn't find a tree builder with...

Extract text inside HTML paragraph using BeautifulSoup in Python

<p> <a name="533660373"></a> <strong>Title: Point of Sale Threats Proliferate</strong><br /> <strong>Severity: Normal Severity</strong><br /> <strong>Published: Thursday, December 04, 2014...

Open web in new tab Selenium + Python

So I am trying to open websites in new tabs inside my WebDriver. I want to do this because opening a new WebDriver for each website takes about 3.5 seconds using PhantomJS, and I want more speed... I'm...

How can I load a saved JSON tree with treelib?

I have made a Python script in which I process a big HTML document with BeautifulSoup while building a tree from it using treelib: http://xiaming.me/treelib/. I have found that this library comes with...

Pull Data/Links from Google Searches using Beautiful Soup

Evening folks, I'm attempting to ask Google a question and pull all the relevant links from its respective search query (i.e. I search "site: Wikipedia.com Thomas Jefferson" and it gives me...

PyCharm Error Loading Package List

I just downloaded PyCharm the other day and wanted to download a few packages. I'm using Windows 10 with PyCharm 2016.2.2 and Python 3.5.2. When I go to the available packages screen it...

Python Library not recognized on Spark Cluster

I have a Spark DataFrame which has text data. I am trying to clean the HTML markup from the data using the Python BeautifulSoup library. However, when I use BeautifulSoup on Spark installed locally...

How to Join Multiple Lists for Python - BeautifulSoup NLTK Analysis

Python newbie here, working on my first web scraping/word frequency analysis using BeautifulSoup and NLTK. I'm scraping Texas' Dept of Justice archive of offenders' last statements. I've gotten to...

How to deploy a Python scraper in the cloud?

I have some Python scrapers (scripts) that I would like to deploy in the cloud so that they run from time to time using some sort of scheduler or cron job. The problem is that I...

Web scraping - Get text from a class with BeautifulSoup and Python?

I want to scrape the text ("Showing 650 results") from a website. The result I am looking for is: Result: Showing 650 results. The following is the HTML code: <div...

Web scraping using Selenium and BeautifulSoup: trouble parsing and selecting a button

I am trying to web scrape the following website: "url='https://angel.co/life-sciences' ". The website contains more than 8000 data entries. From this page I need information like company name and link,...

Recursion Depth Exceeded, pickle and BeautifulSoup

I want to pickle HTML from websites. I save the HTML to a list and try to pickle it. An example of one such list is the HTML from brckhmptn.com/tour. Of course the HTML from this site is a lot, is...

ModuleNotFoundError: No module named 'google'

When I try to use the Google search API, it shows me an error: Traceback (most recent call last): File "C:\Users\Maor Ben Lulu\Desktop\Maor\Python\google\google_Bot.py", line 1, in <module> ...

AttributeError: 'NoneType' object has no attribute 'find' within tbody

I am kind of a newbie in the Python field and am trying to set up a web scraping tool, so I am experimenting with some code. import requests import bs4 website =...

Max retries exceeded with URL in Python requests

I am trying to web scrape this page, and the code I use is this: page = get("https://www.uobgroup.com/online-rates/gold-and-silver-prices.page") I get this error when I run this code: Traceback...

How to scrape images from DuckDuckGo's image search results in Python

I'm creating an application with Python that's going to show images scraped from DuckDuckGo's image search results. So I need to get a list of links to the images based on the search. The problem...

What is the difference between find() and find_all() in Beautiful Soup in Python?

I was doing web scraping but I got stuck and confused by find() and find_all(): where should I use find_all(), and where should I use find()? Also, where can I use these methods, for example in a for loop or on a ul li list...

How do I web scrape using Python and BeautifulSoup?

I'm very new to this, but I have an idea for a website and I want to give it a good go. My aim is to scrape the Asda website for prices and products, more specifically in this case whiskey. I want...

No results when scraping Bing search

I use the code below to scrape results from Bing, and when I look at the scraped web page it says "There are no results for python", but when I search in the browser there is no problem. import...

With Python I want to post on a Facebook timeline without using the API. What's wrong?

I want to write code to log in to Facebook with Python and post on the Facebook timeline, but without using the Facebook API because I don't like it. from selenium import webdriver from getpass import...

How to scrape elements from an HTML Dygraph?

I'm trying to fetch all the data points from this website https://bitinfocharts.com/comparison/bitcoin-transactions.html using BeautifulSoup and requests in Python. So far I have the code: session...

Python: web scraping the content behind an "expand" button with BeautifulSoup

I am scraping a yellow pages site to get the names of all physiotherapists in a city. With the URL I get a list of 50 physiotherapists; however, when I expand the page, the URL does not change. How do...

sh.CommandNotFound: ./compile.sh Buildozer and KivyMD error

OS: (VM) Ubuntu 20.04.2 Python3: 3.8.5 I've been trying to compile an application I created using the following modules: kivy, kivymd, pafy, vlc, sys, threading, time. And also a local custom module...