Python PIL: IOError: cannot identify image file

I am trying to fetch the image at the following URL:

image_url = 'http://www.eatwell101.com/wp-content/uploads/2012/11/Potato-Pancakes-recipe.jpg?b14316'

When I open it in a browser, it certainly looks like an image. But I get an error when I try:

import urllib, cStringIO, PIL 
from PIL import Image 

img_file = cStringIO.StringIO(urllib.urlopen(image_url).read()) 
image = Image.open(img_file)

IOError: cannot identify image file

I have fetched other images this way before, so I am not sure what is special about this one. Is there a way to get this image?


2013-06-18 user984003

The problem is not with the image. Look at what the server actually returns:

>>> urllib.urlopen(image_url).read() 
'\n<?xml version="1.0" encoding="utf-8"?>\n<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"\n "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">\n<html>\n <head>\n <title>403 You are banned from this site. Please contact via a different client configuration if you believe that this is a mistake.</title>\n </head>\n <body>\n <h1>Error 403 You are banned from this site. Please contact via a different client configuration if you believe that this is a mistake.</h1>\n <p>You are banned from this site. Please contact via a different client configuration if you believe that this is a mistake.</p>\n <h3>Guru Meditation:</h3>\n <p>XID: 1806024796</p>\n <hr>\n <p>Varnish cache server</p>\n </body>\n</html>\n'
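To catch this kind of failure before PIL does, one option is to look at the first few bytes of the download. The sketch below is only an illustration: `sniff_payload` and `IMAGE_SIGNATURES` are hypothetical names, it is written for Python 3 rather than the Python 2 used in this thread, and it covers only three common formats.

```python
# Leading "magic" bytes of a few common image formats (not an exhaustive list).
IMAGE_SIGNATURES = {
    "jpeg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def sniff_payload(data):
    """Return an image format name, "markup" for HTML/XML text, or "unknown"."""
    for name, signatures in IMAGE_SIGNATURES.items():
        if data.startswith(signatures):
            return name
    # Error pages such as the 403 body above start with XML or HTML markup.
    head = data.lstrip()[:64].lower()
    if head.startswith((b"<?xml", b"<!doctype", b"<html")):
        return "markup"
    return "unknown"
```

If this reports "markup", the bytes are an error page from the server, and passing them to `Image.open` will fail with "cannot identify image file".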

Sending a User-Agent header solves the problem:

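import urllib2, cStringIO
from PIL import Image

opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]  # present the request as coming from a browser
response = opener.open(image_url)
img_file = cStringIO.StringIO(response.read())  # wrap the raw bytes in a file-like object
image = Image.open(img_file)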
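On Python 3, `urllib2` and `cStringIO` no longer exist, so the same idea might look roughly like the sketch below. It assumes Python 3 and Pillow; `build_image_request` and `open_remote_image` are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```python
import io
import urllib.request


def build_image_request(url, user_agent="Mozilla/5.0"):
    """Build a request that carries a browser-like User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def open_remote_image(url, timeout=10):
    """Download url and open it with Pillow (performs a real network call)."""
    # Imported here so build_image_request stays usable without Pillow installed.
    from PIL import Image

    with urllib.request.urlopen(build_image_request(url), timeout=timeout) as response:
        # io.BytesIO plays the role that cStringIO.StringIO had in Python 2.
        return Image.open(io.BytesIO(response.read()))
```

Calling `open_remote_image(image_url)` should then behave like the Python 2 snippet above, provided the server accepts that User-Agent string.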


2013-06-18 21:07:52 KostasT

Works like a charm. – user984003

This header was also required for some other images: ('Accept', 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8') – user984003

When I open the URL with

In [3]: f = urllib.urlopen('http://www.eatwell101.com/wp-content/uploads/2012/11/Potato-Pancakes-recipe.jpg') 

In [9]: f.code 
Out[9]: 403

the server responds with 403, so what comes back is not an image.

You can try specifying a User-Agent header to see whether you can trick the server into thinking you are a browser.

Using the requests library (because it makes sending header information easier):

In [7]: f = requests.get('http://www.eatwell101.com/wp-content/uploads/2012/11/Potato-Pancakes-recipe.jpg', headers={'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:16.0) Gecko/20100101 Firefox/16.0,gzip(gfe)'}) 
In [8]: f.status_code 
Out[8]: 200
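Before handing the body to PIL, it may also be worth checking both the status code and the Content-Type header. The helper below is a minimal sketch of that check; `check_image_response` is a hypothetical name, and treating any `image/*` media type as acceptable is an assumption.

```python
def check_image_response(status_code, content_type):
    """Raise ValueError unless the response looks like a successful image download."""
    if status_code != 200:
        raise ValueError("server answered with HTTP status %d" % status_code)
    # Drop parameters such as "; charset=..." and normalise the media type.
    media_type = (content_type or "").split(";")[0].strip().lower()
    if not media_type.startswith("image/"):
        raise ValueError("expected an image, got %r" % media_type)
    return media_type
```

With the response `f` from the session above, this would be called as `check_image_response(f.status_code, f.headers.get('Content-Type'))`.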


2013-06-18 21:02:39 dm03514

For some images, you can first save the image to disk and then load that file into PIL. For example:

import urllib2
from PIL import Image

opener = urllib2.build_opener(urllib2.HTTPRedirectHandler(), urllib2.HTTPCookieProcessor()) 
image_content = opener.open("http://www.eatwell101.com/wp-content/uploads/2012/11/Potato-Pancakes-recipe.jpg?b14316").read() 
opener.close() 

save_dir = r"/some/folder/to/save/image.jpg" 
f = open(save_dir,'wb') 
f.write(image_content) 
f.close() 

image = Image.open(save_dir) 
...


2014-07-11 06:45:47

OK, thanks, I forgot to use 'b' (binary mode). – ghanbari
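The save-then-load approach and the binary-mode point can be put together in a small helper. This is a sketch assuming Python 3; `save_image_bytes` is a hypothetical name, and using a temporary file instead of a fixed path is a choice made only for illustration.

```python
import os
import tempfile


def save_image_bytes(data, suffix=".jpg"):
    """Write raw bytes to a temporary file in binary mode and return its path."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    # "wb" writes the bytes exactly as received, with no newline translation.
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path
```

The returned path can then be passed to `Image.open(path)`. The caller is responsible for deleting the file afterwards.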
