Split a PDF into multiple files in C#

We have a C# Windows service that currently processes all PDF files by reading the 2D barcode on each PDF with a third-party component, then updates the database and stores the document in the document repository.

Is there a way I can split the file after reading the barcode and store the parts as separate documents?

For example, a 10-page document should be split into 10 different files.

Thanks.

Source

2010-08-26 acadia

Are you currently using any PDF library? – Marko

My understanding is that the third-party component is only used to detect the barcode inside the PDF file. – gyurisc

You can use a PDF library such as PDFSharp: open the file, loop through its pages, add each page to a new PDF document, and save that document to the file system. Afterwards you can delete or keep the original.

It takes a bit of code, but it is very simple, and these samples will get you started.

http://www.pdfsharp.net/wiki/Default.aspx?Page=ConcatenateDocuments-sample&NS=&AspxAutoDetectCookieSupport=1
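
The loop described above could look roughly like the following. This is only a minimal sketch, assuming PDFsharp 1.x is referenced; the class name `PdfPageSplitter`, the method name `SplitIntoPages`, and the output file naming scheme are made up for illustration and are not part of the library.

```csharp
using System.IO;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

// Hypothetical helper, not part of PDFsharp.
public static class PdfPageSplitter
{
    // Writes every page of sourcePath to its own file in outputFolder
    // and returns the number of files written.
    public static int SplitIntoPages(string sourcePath, string outputFolder)
    {
        Directory.CreateDirectory(outputFolder);
        string baseName = Path.GetFileNameWithoutExtension(sourcePath);

        // Import mode is required when pages are copied into another document.
        PdfDocument input = PdfReader.Open(sourcePath, PdfDocumentOpenMode.Import);

        for (int i = 0; i < input.PageCount; i++)
        {
            // One new single-page document per source page.
            PdfDocument output = new PdfDocument();
            output.Version = input.Version;
            output.AddPage(input.Pages[i]);

            // Assumed naming scheme: <source name>_page<1-based number>.pdf
            string target = Path.Combine(outputFolder, baseName + "_page" + (i + 1) + ".pdf");
            output.Save(target);
        }

        return input.PageCount;
    }
}
```

In the barcode scenario from the question, the service would call something like `PdfPageSplitter.SplitIntoPages(path, folder)` after the barcode has been read, then store each resulting file as its own document.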

Source

2010-08-26 11:50:37 Marko

A previous question answers part of yours, namely how to split PDF documents; if you know where the barcodes are, you can split the document easily:

How can I split up a PDF file into pages (preferably C#)

The recommendation there was a library called PDFSharp, along with a sample demonstrating PDF splitting.

Source

2010-08-26 11:48:43 gyurisc

I had the same question. You can use the iTextSharp component to split the document:

// requires: using System; using System.IO; using iTextSharp.text; using iTextSharp.text.pdf;
public static void Split(string[] args) 
    { 
     if (args.Length != 4) 
     { 
      Console.Error.WriteLine("This tool needs 4 parameters:\nSplit srcfile destfile1 destfile2 pagenumber"); 
     } 
     else 
     { 
      try 
      { 
       int pagenumber = int.Parse(args[3]); 

       // we create a reader for a certain document 
       PdfReader reader = new PdfReader(args[0]); 
       // we retrieve the total number of pages 
       int n = reader.NumberOfPages; 
       Console.WriteLine("There are " + n + " pages in the original file."); 

       if (pagenumber < 2 || pagenumber > n) 
       { 
        throw new DocumentException("You can't split this document at page " + pagenumber + "; there is no such page."); 
       } 

       // step 1: creation of a document-object 
       Document document1 = new Document(reader.GetPageSizeWithRotation(1)); 
       Document document2 = new Document(reader.GetPageSizeWithRotation(pagenumber)); 
       // step 2: we create a writer that listens to the document 
       PdfWriter writer1 = PdfWriter.GetInstance(document1, new FileStream(args[1], FileMode.Create)); 
       PdfWriter writer2 = PdfWriter.GetInstance(document2, new FileStream(args[2], FileMode.Create)); 
       // step 3: we open the document 
       document1.Open(); 
       PdfContentByte cb1 = writer1.DirectContent; 
       document2.Open(); 
       PdfContentByte cb2 = writer2.DirectContent; 
       PdfImportedPage page; 
       int rotation; 
       int i = 0; 
       // step 4: we add content 
       while (i < pagenumber - 1) 
       { 
        i++; 
        document1.SetPageSize(reader.GetPageSizeWithRotation(i)); 
        document1.NewPage(); 
        page = writer1.GetImportedPage(reader, i); 
        rotation = reader.GetPageRotation(i); 
        if (rotation == 90 || rotation == 270) 
        { 
         cb1.AddTemplate(page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation(i).Height); 
        } 
        else 
        { 
         cb1.AddTemplate(page, 1f, 0, 0, 1f, 0, 0); 
        } 
       } 
       while (i < n) 
       { 
        i++; 
        document2.SetPageSize(reader.GetPageSizeWithRotation(i)); 
        document2.NewPage(); 
        page = writer2.GetImportedPage(reader, i); 
        rotation = reader.GetPageRotation(i); 
        if (rotation == 90 || rotation == 270) 
        { 
         cb2.AddTemplate(page, 0, -1f, 1f, 0, 0, reader.GetPageSizeWithRotation(i).Height); 
        } 
        else 
        { 
         cb2.AddTemplate(page, 1f, 0, 0, 1f, 0, 0); 
        } 
        Console.WriteLine("Processed page " + i); 
       } 
       // step 5: we close the document 
       document1.Close(); 
       document2.Close(); 
      } 
      catch(Exception e) 
      { 
       Console.Error.WriteLine(e.Message); 
       Console.Error.WriteLine(e.StackTrace); 
      } 
     } 

    }

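Source

2012-11-08 02:11:09 user1788053

Great first post! Keep it up, and welcome! ;-) –

This code is based on the PDFSharp library

http://www.pdfsharp.com/PDFsharp/

If you want to split by bookmark, here is the code. (Note that the classes it uses, such as PdfReader, SimpleBookmark and PdfContentByte, come from iTextSharp.)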

public static void SplitPDFByBookMark(string fileName) 
    { 
     string sInFile = fileName; 
     PdfReader pdfReader = new PdfReader(sInFile); 
     try 
     { 
      IList<Dictionary<string, object>> bookmarks = SimpleBookmark.GetBookmark(pdfReader); 

      for (int i = 0; i < bookmarks.Count; ++i) 
      { 
       IDictionary<string, object> BM = (IDictionary<string, object>)bookmarks[i]; // current bookmark, not always the first one 
       IDictionary<string, object> nextBM = i == bookmarks.Count - 1 ? null : bookmarks[i + 1]; 

       string startPage = BM["Page"].ToString().Split(' ')[0].ToString(); 
       string startPageNextBM = nextBM == null ? "" + (pdfReader.NumberOfPages + 1) : nextBM["Page"].ToString().Split(' ')[0].ToString(); 
       SplitByBookmark(pdfReader, int.Parse(startPage), int.Parse(startPageNextBM), bookmarks[i].Values.ToArray().GetValue(0).ToString() + ".pdf", fileName); 

      } 
     } 
     catch (Exception ex) 
     { 
      throw; // rethrow without resetting the stack trace 
     } 
    } 
    private static void SplitByBookmark(PdfReader reader, int pageFrom, int PageTo, string outPutName, string inPutFileName) 
    { 
     Document document = new Document(); 
     FileStream fs = new System.IO.FileStream(System.IO.Path.GetDirectoryName(inPutFileName) + '\\' + outPutName, System.IO.FileMode.Create); 

     try 
     { 

      PdfWriter writer = PdfWriter.GetInstance(document, fs); 
      document.Open(); 
      PdfContentByte cb = writer.DirectContent; 
      //holds pdf data 
      PdfImportedPage page; 
      if (pageFrom == PageTo && pageFrom == 1) 
      { 
       // single-page case: only page 1 belongs to this bookmark 
       document.NewPage(); 
       page = writer.GetImportedPage(reader, pageFrom); 
       cb.AddTemplate(page, 0, 0); 
      } 
      else 
      { 
       // copy every page up to, but not including, the next bookmark's start page 
       while (pageFrom < PageTo) 
       { 
        document.NewPage(); 
        page = writer.GetImportedPage(reader, pageFrom); 
        cb.AddTemplate(page, 0, 0); 
        pageFrom++; 
       } 
      } 
      // the document and stream are closed once, in the finally block below 
     } 
     catch (Exception ex) 
     { 
      throw; // rethrow without resetting the stack trace 
     } 
     finally 
     { 
      if (document.IsOpen()) 
       document.Close(); 
      if (fs != null) 
       fs.Close(); 
     } 

    }

Source

2013-02-11 10:59:56

Please provide more information about this solution. How would the OP have to change this code to split the PDF page by page? Which library does this code use? – andr

public int ExtractPages(string sourcePdfPath, string DestinationFolder) 
     { 
      int p = 0; 
      try 
      { 
       iTextSharp.text.Document document; 
       iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(new iTextSharp.text.pdf.RandomAccessFileOrArray(sourcePdfPath), new ASCIIEncoding().GetBytes("")); 
       if (!Directory.Exists(sourcePdfPath.ToLower().Replace(".pdf", ""))) 
       { 
        Directory.CreateDirectory(sourcePdfPath.ToLower().Replace(".pdf", "")); 
       } 
       else 
       { 
        Directory.Delete(sourcePdfPath.ToLower().Replace(".pdf", ""), true); 
        Directory.CreateDirectory(sourcePdfPath.ToLower().Replace(".pdf", "")); 
       } 

       for (p = 1; p <= reader.NumberOfPages; p++) 
       { 
        using (MemoryStream memoryStream = new MemoryStream()) 
        { 
         document = new iTextSharp.text.Document(); 
         iTextSharp.text.pdf.PdfWriter writer = iTextSharp.text.pdf.PdfWriter.GetInstance(document, memoryStream); 
         writer.SetPdfVersion(iTextSharp.text.pdf.PdfWriter.PDF_VERSION_1_2); 
         writer.CompressionLevel = iTextSharp.text.pdf.PdfStream.BEST_COMPRESSION; 
         writer.SetFullCompression(); 
         document.SetPageSize(reader.GetPageSizeWithRotation(p)); // rotation-aware size, so landscape pages are not clipped 
         document.Open(); // open before adding content; GetInstance has already registered the writer 
         iTextSharp.text.pdf.PdfContentByte cb = writer.DirectContent; 
         iTextSharp.text.pdf.PdfImportedPage pageImport = writer.GetImportedPage(reader, p); 
         int rot = reader.GetPageRotation(p); 
         if (rot == 90 || rot == 270) 
         { 
          cb.AddTemplate(pageImport, 0, -1.0F, 1.0F, 0, 0, reader.GetPageSizeWithRotation(p).Height); 
         } 
         else 
         { 
          cb.AddTemplate(pageImport, 1.0F, 0, 0, 1.0F, 0, 0); 
         } 
         document.Close(); 
         document.Dispose(); 
         File.WriteAllBytes(DestinationFolder + "/" + p + ".pdf", memoryStream.ToArray()); 
        } 
       } 
       reader.Close(); 
       reader.Dispose(); 
      } 
      catch (Exception ex) 
      { 
       // report the failure instead of silently swallowing it 
       Console.Error.WriteLine(ex.Message); 
      } 
      return p - 1; 

     }

Call this function wherever you need it and pass the source file path and the destination folder path.
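
An alternative that avoids the manual rotation matrix altogether is to copy pages with PdfCopy rather than redrawing them through AddTemplate. The following is only a sketch under the assumption that iTextSharp 5.x is referenced; `PdfCopySplitter` and `SplitIntoPages` are illustrative names, not library members.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

// Hypothetical helper, not part of iTextSharp.
public static class PdfCopySplitter
{
    // Writes page p of sourcePath to "<outputFolder>/<p>.pdf" for every page
    // and returns the number of pages written.
    public static int SplitIntoPages(string sourcePath, string outputFolder)
    {
        Directory.CreateDirectory(outputFolder);
        PdfReader reader = new PdfReader(sourcePath);
        try
        {
            int pageCount = reader.NumberOfPages;
            for (int p = 1; p <= pageCount; p++)
            {
                string target = Path.Combine(outputFolder, p + ".pdf");
                using (FileStream fs = new FileStream(target, FileMode.Create))
                {
                    Document document = new Document(reader.GetPageSizeWithRotation(p));
                    PdfCopy copy = new PdfCopy(document, fs);
                    document.Open();
                    // PdfCopy copies the page as-is, including its rotation entry.
                    copy.AddPage(copy.GetImportedPage(reader, p));
                    document.Close();
                }
            }
            return pageCount;
        }
        finally
        {
            reader.Close();
        }
    }
}
```

Because the page is copied rather than drawn onto a fresh page, no rotation check for 90 or 270 degrees is needed in this variant.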

Source

2013-09-24 05:59:31

Quite OK, but it has a problem with landscape pages. – Erfan

public void SplitPDFByBookMark(string fileName) 
    { 
     string sInFile = fileName; 
     var pdfReader = new PdfReader(sInFile); 
     try 
     { 
      IList<Dictionary<string, object>> bookmarks = SimpleBookmark.GetBookmark(pdfReader); 

      for (int i = 0; i < bookmarks.Count; ++i) 
      { 
       IDictionary<string, object> BM = (IDictionary<string, object>)bookmarks[i]; 
       IDictionary<string, object> nextBM = i == bookmarks.Count - 1 ? null : bookmarks[i + 1]; 

       string startPage = BM["Page"].ToString().Split(' ')[0].ToString(); 
       string startPageNextBM = nextBM == null ? "" + (pdfReader.NumberOfPages + 1) : nextBM["Page"].ToString().Split(' ')[0].ToString(); 
       SplitByBookmark(pdfReader, int.Parse(startPage), int.Parse(startPageNextBM), bookmarks[i].Values.ToArray().GetValue(0).ToString() + ".pdf", fileName); 

      } 
     } 
     catch (Exception ex) 
     { 
      throw; // rethrow without resetting the stack trace 
     } 
    } 

    private void SplitByBookmark(PdfReader reader, int pageFrom, int PageTo, string outPutName, string inPutFileName) 
    { 
     Document document = new Document(); 
     using (var fs = new FileStream(Path.GetDirectoryName(inPutFileName) + '\\' + outPutName, System.IO.FileMode.Create)) 
     { 
      try 
      { 
       using (var writer = PdfWriter.GetInstance(document, fs)) 
       { 
        document.Open(); 
        PdfContentByte cb = writer.DirectContent; 
        //holds pdf data 
        PdfImportedPage page; 
        if (pageFrom == PageTo && pageFrom == 1) 
        { 
         // single-page case: only page 1 belongs to this bookmark 
         document.NewPage(); 
         page = writer.GetImportedPage(reader, pageFrom); 
         cb.AddTemplate(page, 0, 0); 
        } 
        else 
        { 
         // copy every page up to, but not including, the next bookmark's start page 
         while (pageFrom < PageTo) 
         { 
          document.NewPage(); 
          page = writer.GetImportedPage(reader, pageFrom); 
          cb.AddTemplate(page, 0, 0); 
          pageFrom++; 
         } 
        } 
        // close once, after all pages of this bookmark range have been added 
        document.Close(); 
       } 
       //PdfWriter writer = PdfWriter.GetInstance(document, fs); 

      } 
      catch (Exception ex) 
      { 
       throw; // rethrow without resetting the stack trace 
      } 
     } 
    }

You can install iTextSharp from NuGet, copy and paste this code into a C# application, call the SplitPDFByBookMark() method, and pass it the PDF file name. The code looks up the bookmarks in the file and splits on them, and you're done!
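
The page-range arithmetic behind the bookmark split can be looked at in isolation. The sketch below is library-free; `BookmarkRanges.Compute` is a hypothetical helper, and it assumes the bookmark start pages are already parsed to integers and sorted in ascending order. When two bookmarks start on the same page, it gives the earlier one that single page, which is a simplification of the special case in the code above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only.
public static class BookmarkRanges
{
    // Returns (firstPage, lastPage) pairs, both 1-based and inclusive.
    // Each bookmark's section runs up to the page before the next bookmark;
    // the last section runs to the end of the document.
    public static List<Tuple<int, int>> Compute(IList<int> startPages, int totalPages)
    {
        var ranges = new List<Tuple<int, int>>();
        for (int i = 0; i < startPages.Count; i++)
        {
            int first = startPages[i];
            int nextStart = i == startPages.Count - 1 ? totalPages + 1 : startPages[i + 1];
            // Never let a section be empty: it covers at least its own start page.
            int last = Math.Max(first, nextStart - 1);
            ranges.Add(Tuple.Create(first, last));
        }
        return ranges;
    }
}
```

For a 10-page file with bookmarks on pages 1, 4 and 9, `BookmarkRanges.Compute(new List<int> { 1, 4, 9 }, 10)` yields the ranges 1-3, 4-8 and 9-10.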

Source

2018-01-16 14:45:47

**Thanks, Milo.** –
