Logging into a site that uses Live.com authentication

I've been trying to automate a log in to a website I frequent, www.bungie.net. The site is associated with Microsoft and Xbox Live, and as such makes use of the Windows Live ID API when people...

Making bots that navigate the web

I've always wanted to automate things such as in-browser games like OGame, or Facebook poking. I could use the java.awt.Robot class, which is the only solution I've found out there, but that...

Ruby on Rails: how to determine if a request was made by a robot or search engine spider?

I have Rails apps that record the IP address of every request to a specific URL, but in my IP database I've found Facebook block IPs like 66.220.15.* and Google IPs (I suspect they come from bots). Is...

Exclude bots and spiders from a View counter in PHP

I have built a pretty basic advertisement manager for a website in PHP. I say basic because it's not complex like Google or Facebook ads or even most high-end ad servers. Doesn't handle payments...

Simple Python Crawler / Spider Runtime Error

I have a simple Python crawler / spider that searches for a specified text on a site that I provide. But on some sites it crawls normally for 2-4 seconds until an error occurs. The code so...

Any alternatives to Spiderable?

Our version of Meteor is 0.8.1, which means it crashes when we try to install the current version of Spiderable. PhantomJS has something to do with this incompatibility because it has some...

In Meteor: how to apply OpenGraph dynamically for search engines (Google+ or Facebook)

What I want is for links to my pages to be shown in Google+ or Facebook posts with the OpenGraph tags. I made my post page change the <meta property="og:title" ... tag dynamically...

Facebook og: tags not found - operation timeout (Meteor)

I'm working on a Meteor application and trying to get the Facebook metadata to show up when people share links. We are using Iron Router and would eventually like to have dynamic meta content,...

HTML snippets for AngularJS app that uses pushState?

I'm deciding whether it's safe to develop my client-facing app in AngularJS using pushState. I've read that when using pushState in an AngularJS app, we don't need to worry about Googlebot because...

nginx - no ssl for crawlers / bots

In a server directive for port 80 in nginx I want to redirect all requests to https if the user agent is not a bot. I tried using this: ... location / { if ($http_user_agent !~*...

Get link and text using scrapy

I want to find the URLs of a web page that match a specific regex. I used the Scrapy package in Python. My code looks like this: name = 'testingcode' start_urls = ['http://dinoopnair.blogspot.in/'] # urls...

How to run a spider from bat file for multiple urls?

I wanted to prepare a multiple.bat file to run several spiders, so I first tried to prepare a multiple.bat file for one spider. I got stuck here. I got this...

Scrapy, how to limit time per domain?

I have been searching for an answer and there is no answer on this forum although several questions have been asked. One answer is that it is possible to stop the spider after a certain time, but that is...

How can I find a URL in a span that is in a div?

I am trying to find a URL that's in a span, which is in a div. In this case, it's the link with the class "company_url" that I'm after. <div class="links standard"> <span class="link"> ...

Run parallel parse function python scrapy

I'm using Scrapy together with Selenium. I want to run my parse function as many tasks in parallel, opening many URLs simultaneously, so I use the Pool.map function to map my parse() function to...

How to scrape and parse nested div with scrapy

Trying to follow this GitHub page in order to learn how to crawl nested divs in Facebook....

Redirecting Facebook/Twitter/Google Spiders with web.config to a folder with a PHP script

I am running an Angular application that does not play well with web spiders. So I am instead going to redirect all spiders to a PHP page that renders the metadata on the server side. Here is the...

What are the best practices for calling an external api?

So let's say I want to write a spider that uses the Facebook API to calculate the likes on every page of a website. If I import the requests library, I'm able to call the Facebook Graph API as...

Need to use arrow keys to move through search suggestions

I have a search bar with good suggestions and I need to be able to use the arrow keys to move through the options, but it doesn't work and I am really confused. Please can you help me out? I have...

Scrapy FormRequest from response AttributeError: 'str' object has no attribute 'encoding'

I am trying to log in to Facebook using Scrapy. I have identified that the mobile version of Facebook works without JavaScript, so I am using it. The relevant code is: from loginform import...

Scraping from list of urls using selenium and scrapy

I'm new to Scrapy and Python. I'm trying to scrape from a list of URLs using Selenium and Scrapy. I tried this code: class TechstarSpider(scrapy.Spider): name = "techstar" ...

Why is there an AttributeError in MongoDB's collection?

After running a script, I'm getting this error even though I'm successful in yielding the data to a .csv file. Here is the error: Traceback (most recent call last): File...

Scraping some Facebook data but not all? Scrapy/Splash/Python

I have a spider that looks like this: import scrapy from scrapy_splash import SplashRequest class BarkbotSpider(scrapy.Spider): name = 'barkbot' start_urls = [ ...

Scrapy - change settings at runtime based on attribute provided

I'm having fun with Scrapy, working on this project, a spider for Facebook's posts. I would like to change the CONCURRENT_REQUESTS parameter in settings.py at runtime, if a boolean attribute is...

Trying to iterate through rows in a pandas dataframe, grab the list of urls in each row, and extract emails using scrapy

I wrote some code that adds lists of relevant URLs to a row in a pandas dataframe. Now, I'm trying to iterate through those lists and search each one for email addresses. Here's what I have thus...

Why is the vertical scroll getting stuck in mobile view?

The webpage, when in mobile view, is getting stuck when I attempt to scroll vertically. In @media screen and (max-width: 952px) {} I have overflow-x: hidden;, width: 100%;, and position: absolute;...

Why am I getting a 403 error when scraping using Scrapy?

I am getting the following error even though I have seen many questions related to 403 Forbidden in Scrapy and made changes according to those answers. I have changed the user-agents as well as rotated...

Scrapy: parse callback is not defined

I always get NotImplementedError('{}.parse callback is not defined'.format(self.__class__.__name__)), even though I tried to follow the example here. 2019-12-27 11:40:40 [scrapy.core.engine] DEBUG:...

nginx secure link sometimes returns a 403 error

In the logs of my server I often find a 403 error when a user accesses mp4 files, links to which are hidden using a secure link in nginx. Most users do not face such difficulties. But only a few...

Game doesn't connect to server via Sockets on iOS, but does on other devices

Something is preventing my game from connecting to a server, and it fails to get approved by the App Store Review Team. The game was released on Google Play and works successfully on all Android...