Author Topic: [Python] Manga downloader  (Read 871 times)

Offline bonewagon

  • NULL
  • Posts: 2
  • Cookies: 4
[Python] Manga downloader
« on: March 19, 2014, 09:24:09 pm »
This script downloads a manga (one at a time) from mangareader.net for offline reading.

Code: (python) [Select]
#!/usr/bin/python
# Python 2 script (uses urllib2 and print statements).
import urllib2
from re import compile, findall
from sys import argv
from os import mkdir

# Raw strings keep the regex backslashes intact.
chapter_exp = compile(r'(?<=href=")/(?:\d+-)+\d+/[\w-]+/chapter-\d+\.html') # extract the URLs for each chapter
page_exp = compile(r'(?<=value=")/(?:\d+-)+\d+/[\w-]+/chapter-\d+\.html') # the URLs for each page in the chapter
img_exp = compile(r'(?<=src=")https?://i\d*\.mangareader\.net/[^"]*\d+\.jpg') # the image for the page; [^"] keeps the match inside one src attribute

chapter_number = page_number = 1

def get_matches(url, exp):
    """Fetch url and return the unique matches of exp in the page source."""
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 6.1; rv:20.0) Gecko/20100101 Firefox/20.0')]
    matches = findall(exp, opener.open(url).read())
    # Drop duplicates, keeping the last occurrence of each match.
    unique = []
    for match in reversed(matches):
        if match not in unique:
            unique.append(match)
    unique.reverse()
    return unique

def fetch_image(url):
    """Download url and return the raw image bytes."""
    opener = urllib2.build_opener()
    opener.addheaders = [('User-agent', 'Mozilla/5.0 (Windows NT 6.1; rv:20.0) Gecko/20100101 Firefox/20.0')]
    return opener.open(url).read()

def usage():
    """Print the option summary and stop the script."""
    print "Usage: mangadownloader.py [options]"
    print "Options: \n\n"

    print "-starturl=http://www.mangareader.net/311/bloody-monday.html \t download the manga for which the table of contents is at the given mangareader URL (Bloody Monday in this case)\n"
    print "-outputpath=/path/to/output/directory \t Write the manga into this directory\n"
    print "-subdir=BloodyMonday \t In the output path listed above, create a subdirectory titled 'BloodyMonday' to put the manga into (optional)\n"
    print "-mergechapters \t Merge the pages of all chapters into one chapter (optional)\n"
    print "-help or -h \t Display this message"

    raise SystemExit # stop the script after printing the help text

if "-help" in argv or "-h" in argv:
    usage()

arguments = {}

for arg in argv:

    if arg.startswith("-starturl="):
        arguments["starturl"] = arg[len("-starturl="):]

    elif arg.startswith("-outputpath="):
        arguments["outputpath"] = arg[len("-outputpath="):]

    elif arg.startswith("-subdir="):
        arguments["subdir"] = arg[len("-subdir="):]


for required_argument in ["starturl", "outputpath"]:
    if required_argument not in arguments:
        usage()

arguments["mergechapters"] = "-mergechapters" in argv

if "subdir" in arguments:
    arguments["outputpath"] += "/" + arguments["subdir"]
try:
    mkdir(arguments["outputpath"])
except OSError: # directory already exists
    pass

chapters = get_matches(arguments["starturl"], chapter_exp)
chapters = ["http://www.mangareader.net" + chapter for chapter in chapters] # convert to absolute URLs

for chapter in chapters:

    if not arguments["mergechapters"]:
        # one numbered subdirectory per chapter
        path = arguments["outputpath"] + "/" + str(chapter_number)
        try:
            mkdir(path)
        except OSError: # directory already exists
            pass
        chapter_number += 1
    else:
        path = arguments["outputpath"]


    pages = get_matches(chapter, page_exp)
    for page in pages:
        image = fetch_image(get_matches("http://www.mangareader.net" + page, img_exp)[0])
        with open(path + "/" + str(page_number) + ".jpg", "wb") as image_file:
            image_file.write(image)
        page_number += 1
    if not arguments["mergechapters"]:
        page_number = 1 # restart page numbering for the next chapter

raw_input("Done. Press [enter] to quit: ")
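
To see roughly what the chapter regex picks up and how duplicates are handled, here is a small offline sketch. It is written for Python 3 (the script above is Python 2), and the HTML fragment is invented for illustration, so the real page source may look different. The helper name `unique_matches` is hypothetical as well.

```python
import re

# Same pattern as chapter_exp in the script, written as a raw string.
chapter_exp = re.compile(r'(?<=href=")/(?:\d+-)+\d+/[\w-]+/chapter-\d+\.html')

# Hypothetical table-of-contents fragment, invented for illustration only.
sample_html = '''
<a href="/311-20002-1/bloody-monday/chapter-2.html">Bloody Monday 2</a>
<a href="/311-20001-1/bloody-monday/chapter-1.html">Bloody Monday 1</a>
<a href="/311-20002-1/bloody-monday/chapter-2.html">Bloody Monday 2</a>
'''

def unique_matches(exp, text):
    """Return the matches of exp in text, keeping the last occurrence of each."""
    unique = []
    for match in reversed(exp.findall(text)):
        if match not in unique:
            unique.append(match)
    unique.reverse()
    return unique

# Convert the relative paths to absolute URLs, as the script does.
chapters = ["http://www.mangareader.net" + path
            for path in unique_matches(chapter_exp, sample_html)]

for chapter in chapters:
    print(chapter)
```

The lookbehind `(?<=href=")` means only the path itself is returned, without the attribute name.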

Usage:

-starturl=http://www.mangareader.net/326/code-geass-lelouch-of-the-rebellion.html (This downloads the manga which is at the specified URL - Code Geass in this case. Obviously it must be a URL at mangareader.net.)

-outputpath=C:/Users/Me/path/to/output/directory (Specify the directory the manga chapters and pages are to be written into.)

-subdir=CodeGeass (Creates a subdirectory with the specified name - "CodeGeass" in this case - inside the directory given in the option above, and the manga goes into it. This is optional.)

-mergechapters (Rather than organizing the manga into chapters, all the pages are written into the same directory. This is optional and I don't recommend it; I don't see why anyone would want it for anything but the shortest of mangas.)
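
The same four options could also be handled with the standard `argparse` module instead of the hand-written loop. This is only a sketch in Python 3, not what the script does; the function name `parse_options` is made up here, and the help flags are re-declared by hand so that both `-h` and `-help` keep working.

```python
import argparse

def parse_options(argv):
    """Parse the downloader's options from a list of argument strings."""
    parser = argparse.ArgumentParser(prog="mangadownloader.py", add_help=False)
    parser.add_argument("-h", "-help", action="help",
                        help="display this message")
    parser.add_argument("-starturl", required=True,
                        help="URL of the manga's table of contents on mangareader.net")
    parser.add_argument("-outputpath", required=True,
                        help="directory to write the manga into")
    parser.add_argument("-subdir", default=None,
                        help="optional subdirectory to create inside the output path")
    parser.add_argument("-mergechapters", action="store_true",
                        help="write all pages into one directory")
    return parser.parse_args(argv)

# Example values taken from the usage notes above.
options = parse_options([
    "-starturl=http://www.mangareader.net/326/code-geass-lelouch-of-the-rebellion.html",
    "-outputpath=C:/Users/Me/path/to/output/directory",
    "-subdir=CodeGeass",
])
print(options.starturl, options.outputpath, options.subdir, options.mergechapters)
```

With this approach a missing `-starturl` or `-outputpath` makes `argparse` print an error and exit, which replaces the manual check for required arguments.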

The chapters will be numbered from 1 to the number of chapters the manga has, and the pages in each chapter will be numbered from 1 to the number of pages that chapter has. With -mergechapters, the page numbers keep counting up across chapters instead of restarting at 1.
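
The resulting layout can be summed up in a tiny helper. This is an illustrative Python 3 sketch; `page_path` is a hypothetical name and does not exist in the script, which builds the same strings inline.

```python
def page_path(output_dir, chapter_number, page_number, merge_chapters=False):
    """Return the file path used for one page image."""
    if merge_chapters:
        # All pages share one directory; page numbers run across chapters.
        parts = [output_dir, str(page_number) + ".jpg"]
    else:
        # One numbered subdirectory per chapter; pages restart at 1.
        parts = [output_dir, str(chapter_number), str(page_number) + ".jpg"]
    return "/".join(parts)

print(page_path("CodeGeass", 3, 12))
print(page_path("CodeGeass", 3, 57, merge_chapters=True))
```

Because the names are plain integers, a file browser that sorts alphabetically may list 10.jpg before 2.jpg.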
« Last Edit: March 20, 2014, 02:38:12 am by lucid »

Offline Zesh

  • Royal Highness
  • ****
  • Posts: 699
  • Cookies: 42
Re: [Python] Manga downloader
« Reply #1 on: March 19, 2014, 09:26:02 pm »
Oh, come on. Everyone knows that MangaStream is the best manga site :P

+1.

Offline flowjob

  • Knight
  • **
  • Posts: 327
  • Cookies: 46
  • Pastafarian
Re: [Python] Manga downloader
« Reply #2 on: March 19, 2014, 09:47:09 pm »
That's a coincidence. I am writing one myself for mangafox right now...  :P
Also using Python. Only have to fix the downloading, as the links are relative in some mangas.
Probably will post mine pretty soon too.
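
For the relative-link problem, one common approach is to resolve every link against the URL of the page it came from. A minimal Python 3 sketch, with made-up example URLs (in Python 2 the same function lives in the `urlparse` module):

```python
from urllib.parse import urljoin

# Hypothetical page URL and links, for illustration only.
page_url = "http://example.com/manga/some-title/c001/1.html"
links = [
    "2.html",                               # relative to the current page
    "/store/some-title/c001/002.jpg",       # relative to the site root
    "http://img.example.com/002.jpg",       # already absolute
]

# urljoin leaves absolute URLs untouched and resolves the rest.
absolute = [urljoin(page_url, link) for link in links]

for url in absolute:
    print(url)
```

This way the download code does not need to know in advance which kind of link a given manga uses.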
Quote
<phil> I'm gonna DDOS the washing machine with clothes packets.
<deviant_sheep> dont use too much soap or youll cause a bubble overflow

Offline lucid

  • #Underground
  • Titan
  • **
  • Posts: 2683
  • Cookies: 243
  • psychonaut
Re: [Python] Manga downloader
« Reply #3 on: March 20, 2014, 02:39:00 am »
Added python tags to the code.

Code: (text) [Select]
[code=python] 
« Last Edit: March 20, 2014, 02:40:18 am by lucid »
"Hacking is at least as much about ideas as about computers and technology. We use our skills to open doors that should never have been shut. We open these doors not only for our own benefit but for the benefit of others, too." - Brian the Hacker

Quote
15:04  @Phage : I'm bored of Python