In this tutorial, I will show you how to extract (or split out) specific pages from a PDF file and save those pages as a separate PDF using Python.
Before we dive into the tutorial, you will need to install the PyPDF2 library (pip install PyPDF2).
Source Code:
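# PyPDF2 3.0 removed the older PdfFileReader / PdfFileWriter names,
# so this version uses PdfReader / PdfWriter instead.
from PyPDF2 import PdfReader, PdfWriter

pdf_file_path = 'Unknown.pdf'
file_base_name = pdf_file_path.replace('.pdf', '')

pdf = PdfReader(pdf_file_path)

pages = [0, 2, 4]  # zero-based indexes for pages 1, 3, 5
pdfWriter = PdfWriter()

# Copy each selected page into the writer
for page_num in pages:
    pdfWriter.add_page(pdf.pages[page_num])

# The with block closes the file automatically, so no explicit close() is needed
with open('{0}_subset.pdf'.format(file_base_name), 'wb') as f:
    pdfWriter.write(f)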
Hello, I watched your video on YouTube. Your videos are amazing. I was curious: is there a way to get this script to use file names in a column from a .csv file as variable inputs?
My preferred method is to use either the csv module or the Pandas library to read the data from a CSV file and store the dataset in an array; then I can iterate over each row of values accordingly.
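As a minimal sketch of the csv-module approach described above, the helper below collects the values of one column into a list. The function name read_pdf_paths and the column header file_name are assumptions made for illustration; adjust them to match your own CSV file.

```python
import csv


def read_pdf_paths(csv_path, column="file_name"):
    """Return the non-empty values of one CSV column as a list of strings.

    The column name "file_name" is an assumed header, not something
    defined by the tutorial.
    """
    paths = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)  # uses the first row as column headers
        for row in reader:
            # Missing or short rows yield None, so fall back to an empty string
            value = (row.get(column) or "").strip()
            if value:
                paths.append(value)
    return paths
```

Each returned value could then be used as pdf_file_path in the script above, for example by looping over read_pdf_paths('files.csv') and running the extraction once per file.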
Hello, can you help me add a paragraph to a particular page of a Word document using Python?
Hello,
How about this:
I have a 187-page PDF file.
I want to prompt the user to select the page count to split the file by, e.g. 20, 45, or any other number of pages.
I want to know how I would write this.
I can write the code if the number of pages divides evenly with no remainder. My issue is how to handle a file with an odd page count, e.g. a 197-page file. Split it into 20-page chunks; then, if the last group of pages has fewer than 8 pages, just add it to the last file, making the last one 28 pages. Or, if the last group of pages is larger than 10, save that last group as its own file, as we just did with the 20-page chunks.
Here:
source file: file1.pdf
total pages: 25
split the doc into 7-page chunk files
file1: 7 pages
file2: 7 pages
file3: 11 pages
or:
filex.pdf
pages: 800
split chunks: 28 pages
total files: 28 files of 28 pages each
remaining pages: 16 pages ** These 16 pages should also be saved as a new file. If, on the other hand, there are fewer than 10 remaining pages, they should be combined with the previous chunk as the final split.
Note: all these numbers must be entered dynamically at run time, because we have no idea how big the source file is or what the split page count will be.
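One possible reading of the rule described above can be sketched as a small function that only does the page arithmetic. The name chunk_ranges and the min_tail parameter are hypothetical, and the rule assumed here is that a leftover group smaller than min_tail is merged into the previous chunk, while a larger one becomes its own file.

```python
def chunk_ranges(total_pages, chunk_size, min_tail):
    """Return (start, end) zero-based page ranges, with end exclusive.

    Assumed rule: leftover pages fewer than min_tail are merged into the
    previous chunk; otherwise they form their own final chunk.
    """
    if total_pages < 1 or chunk_size < 1:
        raise ValueError("total_pages and chunk_size must be positive")

    full_chunks, remainder = divmod(total_pages, chunk_size)
    ranges = [(i * chunk_size, (i + 1) * chunk_size) for i in range(full_chunks)]

    if remainder:
        if ranges and remainder < min_tail:
            # Extend the last full chunk to absorb the short leftover
            start, _ = ranges.pop()
            ranges.append((start, total_pages))
        else:
            # Leftover is large enough (or is the only chunk): keep it separate
            ranges.append((full_chunks * chunk_size, total_pages))
    return ranges
```

With the second example above (800 pages, 28-page chunks, threshold of 10), this returns 28 ranges of 28 pages plus a final range of 16 pages. The three numbers could be collected at run time with int(input(...)), taking total_pages from the page count of the source PDF, and each (start, end) range could then be written to its own file using the same writer loop as in the source code above.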
I have multiple tables in one PDF file, and I want to extract each table along with its page number using Camelot. Can you help me?
Thank you for this amazing source code!