Skip to content

Commit

Permalink
Feature: Support manual redaction (#2433)
Browse files Browse the repository at this point in the history
# Description

## Manual Redaction:

- ### Text Selection-based redaction:
-
![image](https://github.com/user-attachments/assets/e59c5e6c-ef52-4f54-a98e-fc26e3226c8e)
- Users can now redact currently selected text by selecting the text
then clicking `ctrl + s` shortcut or by pressing on **apply/save/disk
icon** in the toolbar.
- Users can delete/cancel the redacted area by clicking on the box
containing the text, then clicking on `delete/trash` icon or by using
the shortcut `delete`.
- Users can customize the color of the redacted area/text (after the
redaction was applied) by simply clicking on the box containing the
text/area then clicking on the `color palette` icon and choosing the
color they want.
- Users can choose to select the color of redaction before redacting
text or applying changes (this only affects newly created redaction
areas, to change the color of an existing one; check the previous bullet
point).


- ### Draw/Area-based redaction:
-
![image](https://github.com/user-attachments/assets/e2968ae3-ebaf-497e-b3bd-0c8c8f4ee157)
- Users can now redact an area in the page by selecting the then
clicking `ctrl + s` shortcut or by pressing on **apply/save/disk icon**
in the toolbar.
- Users can delete/cancel the redacted area by clicking on the drawn
box, then clicking on `delete/trash` icon or by using the shortcut
`delete` (requires temporarily turning off drawing mode).
- Users can customize the color of the redacted area (after the
redaction was applied) by simply clicking on the box containing the area
then clicking on the `color palette` icon and choosing the color they
want.
- Users can choose to select the color of redaction before drawing the
box or applying changes (this only affects newly created redaction
areas, to change the color of an existing one; check the previous bullet
point).

 - ### Page-based redaction:
-
![image](https://github.com/user-attachments/assets/aba59432-23e7-4fe6-aa28-872c23a91242)
- Users can now redact **ENTIRE** pages by specifying the page
number(s), range(s) or functions.
- Users can customize the color of page-based redaction (doesn't affect
text-based nor draw-based redactions).

### Redaction modes:
There are three modes of redaction/operation currently supported
  - Text Selection-based redaction (TEXT)
  - Draw/Area-based redaction (DRAWING)
  - None - by simply not choosing any of the above modes (NONE).

## How to use:

- **Text Selection-based redaction:** click on this icon in the toolbar
![image](https://github.com/user-attachments/assets/52cc31ef-6946-482c-84a2-1ddb79646dd8)
to enable `text-selection redaction mode` then select the text you want
to redact then press `ctrl + s` or click on the disk/save icon
![image](https://github.com/user-attachments/assets/f2bdf2f2-ee07-4682-bb9a-95e13a1004cf).
- **Draw/Area-based redaction:** click on this icon in the toolbar
![image](https://github.com/user-attachments/assets/fe00dca9-761e-47a0-a748-2041830dc73e)
to enable `draw/area-based redaction` then `left mouse click (LMB)` on
the starting point of the rectangle, then once you are satisfied with
the rectangle's placement/dimensions then `left mouse click (LMB)` again
to apply the redaction.
- **Example:** `Left mouse click (LMB)` then move mouse to the right
then bottom then `Left mouse click (LMB)`.
- Note: Red box/rectangle borders indicate that you have not yet saved
(you need to left click on the page to save)
![image](https://github.com/user-attachments/assets/5ce5f789-9d6f-4984-8555-e8fef2a3e3cc)
once saved the borders will become green
![image](https://github.com/user-attachments/assets/85cabb9f-e7ee-4268-90cd-80493b625466)
(they also become clickable/hover-able when drawing mode is off).
- **Page-based redactions:**: Insert the page number(s), range(s) and/or
functions (separated by `,`) then select your preferred color and click
on `Redact` to submit.

![image](https://github.com/user-attachments/assets/ed8a0a98-32b2-4ae1-a3c7-c54bfe0fea66)

- **Color Customizations:**
- You can change the redaction color for new redactions by clicking on
this icon in the toolbar
![image](https://github.com/user-attachments/assets/bad573ee-0545-4329-b131-2022f970f134).
- You can change the redaction color for existing redactions by hovering
over the redaction box then clicking on it (`Left mouse click LMB`) then
clicking on color palette (highlighted in red in the picture)
![image](https://github.com/user-attachments/assets/22281a81-2cd9-4771-9a93-a75b6dd93433)
then select your preferred color.

- **Deletions:**
- You can delete a redacted area by hovering over the redaction box then
clicking on it (`Left mouse click LMB`) then clicking on the trash icon
(highlighted in red in the picture)
![image](https://github.com/user-attachments/assets/f0347279-8211-4b1c-a91d-c1fcb929cc5d).

## Card in the home page:

![image](https://github.com/user-attachments/assets/b3fb16eb-5ff0-4548-9f22-b1b8fe162c8b)

Closes #465

## Checklist

- [x] I have read the [Contribution
Guidelines](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/CONTRIBUTING.md)
- [x] I have performed a self-review of my own code
- [x] I have attached images of the change if it is UI based
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] If my code has heavily changed functionality I have updated
relevant docs on [Stirling-PDFs doc
repo](https://github.com/Stirling-Tools/Stirling-Tools.github.io/blob/main/docs/)
- [ ] My changes generate no new warnings
- [ ] I have read the section [Add New Translation
Tags](https://github.com/Stirling-Tools/Stirling-PDF/blob/main/HowToAddNewLanguage.md#add-new-translation-tags)
(for new translation tags only)

---------

Co-authored-by: Anthony Stirling <77850077+Frooodle@users.noreply.github.com>
  • Loading branch information
omar-ahmed42 and Frooodle authored Jan 6, 2025
1 parent 79f6598 commit 22af79a
Show file tree
Hide file tree
Showing 14 changed files with 7,549 additions and 9 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -135,6 +135,7 @@ public void init() {
addEndpointToGroup("Security", "remove-cert-sign");
addEndpointToGroup("Security", "sanitize-pdf");
addEndpointToGroup("Security", "auto-redact");
addEndpointToGroup("Security", "redact");

// Adding endpoints to "Other" group
addEndpointToGroup("Other", "ocr-pdf");
Expand Down Expand Up @@ -234,6 +235,7 @@ public void init() {
addEndpointToGroup("Java", "markdown-to-pdf");
addEndpointToGroup("Java", "show-javascript");
addEndpointToGroup("Java", "auto-redact");
addEndpointToGroup("Java", "redact");
addEndpointToGroup("Java", "pdf-to-csv");
addEndpointToGroup("Java", "split-by-size-or-count");
addEndpointToGroup("Java", "overlay-pdf");
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -3,13 +3,18 @@
import java.awt.*;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Collections;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.PDPageContentStream;
import org.apache.pdfbox.pdmodel.PDPageTree;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.WebDataBinder;
import org.springframework.web.bind.annotation.InitBinder;
import org.springframework.web.bind.annotation.ModelAttribute;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
Expand All @@ -21,12 +26,17 @@
import io.swagger.v3.oas.annotations.tags.Tag;

import lombok.extern.slf4j.Slf4j;

import stirling.software.SPDF.model.PDFText;
import stirling.software.SPDF.model.api.security.ManualRedactPdfRequest;
import stirling.software.SPDF.model.api.security.RedactPdfRequest;
import stirling.software.SPDF.model.api.security.RedactionArea;
import stirling.software.SPDF.pdf.TextFinder;
import stirling.software.SPDF.service.CustomPDDocumentFactory;
import stirling.software.SPDF.utils.GeneralUtils;
import stirling.software.SPDF.utils.PdfUtils;
import stirling.software.SPDF.utils.WebResponseUtils;
import stirling.software.SPDF.utils.propertyeditor.StringToArrayListPropertyEditor;

@RestController
@RequestMapping("/api/v1/security")
Expand All @@ -41,6 +51,107 @@ public RedactController(CustomPDDocumentFactory pdfDocumentFactory) {
this.pdfDocumentFactory = pdfDocumentFactory;
}

@InitBinder
public void initBinder(WebDataBinder binder) {
binder.registerCustomEditor(List.class, "redactions", new StringToArrayListPropertyEditor());
}

@PostMapping(value = "/redact", consumes = "multipart/form-data")
@Operation(summary = "Redacts areas and pages in a PDF document", description = "This operation takes an input PDF file with a list of areas, page number(s)/range(s)/function(s) to redact. Input:PDF, Output:PDF, Type:SISO")
public ResponseEntity<byte[]> redactPDF(@ModelAttribute ManualRedactPdfRequest request) throws IOException {
MultipartFile file = request.getFileInput();
List<RedactionArea> redactionAreas = request.getRedactions();

PDDocument document = pdfDocumentFactory.load(file);

PDPageTree allPages = document.getDocumentCatalog().getPages();

redactPages(request, document, allPages);
redactAreas(redactionAreas, document, allPages);

if (request.isConvertPDFToImage()) {
PDDocument convertedPdf = PdfUtils.convertPdfToPdfImage(document);
document.close();
document = convertedPdf;
}

ByteArrayOutputStream baos = new ByteArrayOutputStream();
document.save(baos);
document.close();

byte[] pdfContent = baos.toByteArray();
return WebResponseUtils.bytesToWebResponse(
pdfContent,
Filenames.toSimpleFileName(file.getOriginalFilename()).replaceFirst("[.][^.]+$", "")
+ "_redacted.pdf");
}

private void redactAreas(List<RedactionArea> redactionAreas, PDDocument document, PDPageTree allPages)
throws IOException {
Color redactColor = null;
for (RedactionArea redactionArea : redactionAreas) {
if (redactionArea.getPage() == null || redactionArea.getPage() <= 0
|| redactionArea.getHeight() == null || redactionArea.getHeight() <= 0.0D
|| redactionArea.getWidth() == null || redactionArea.getWidth() <= 0.0D)
continue;
PDPage page = allPages.get(redactionArea.getPage() - 1);

PDPageContentStream contentStream = new PDPageContentStream(
document, page, PDPageContentStream.AppendMode.APPEND, true, true);
redactColor = decodeOrDefault(redactionArea.getColor(), Color.BLACK);
contentStream.setNonStrokingColor(redactColor);

float x = redactionArea.getX().floatValue();
float y = redactionArea.getY().floatValue();
float width = redactionArea.getWidth().floatValue();
float height = redactionArea.getHeight().floatValue();

PDRectangle box = page.getBBox();

contentStream.addRect(x, box.getHeight() - y - height, width, height);
contentStream.fill();
contentStream.close();
}
}

private void redactPages(ManualRedactPdfRequest request, PDDocument document, PDPageTree allPages)
throws IOException {
Color redactColor = decodeOrDefault(request.getPageRedactionColor(), Color.BLACK);
List<Integer> pageNumbers = getPageNumbers(request, allPages.getCount());
for (Integer pageNumber : pageNumbers) {
PDPage page = allPages.get(pageNumber);

PDPageContentStream contentStream = new PDPageContentStream(
document, page, PDPageContentStream.AppendMode.APPEND, true, true);
contentStream.setNonStrokingColor(redactColor);

PDRectangle box = page.getBBox();

contentStream.addRect(0, 0, box.getWidth(), box.getHeight());
contentStream.fill();
contentStream.close();
}
}

private Color decodeOrDefault(String hex, Color defaultColor) {
Color color = null;
try {
color = Color.decode(hex);
} catch (Exception e) {
color = defaultColor;
}

return color;
}

private List<Integer> getPageNumbers(ManualRedactPdfRequest request, int pagesCount) {
String pageNumbersInput = request.getPageNumbers();
String[] parsedPageNumbers = pageNumbersInput != null ? pageNumbersInput.split(",") : new String[0];
List<Integer> pageNumbers = GeneralUtils.parsePageList(parsedPageNumbers, pagesCount, false);
Collections.sort(pageNumbers);
return pageNumbers;
}

@PostMapping(value = "/auto-redact", consumes = "multipart/form-data")
@Operation(
summary = "Redacts listOfText in a PDF document",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,12 @@ public String autoRedactForm(Model model) {
return "security/auto-redact";
}

@GetMapping("/redact")
public String redactForm(Model model) {
model.addAttribute("currentPage", "redact");
return "security/redact";
}

@GetMapping("/add-password")
@Hidden
public String addPasswordForm(Model model) {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
package stirling.software.SPDF.model.api.security;

import java.util.List;

import io.swagger.v3.oas.annotations.media.Schema;
import lombok.Data;
import lombok.EqualsAndHashCode;
import stirling.software.SPDF.model.api.PDFWithPageNums;

@Data
@EqualsAndHashCode(callSuper = true)
public class ManualRedactPdfRequest extends PDFWithPageNums {
@Schema(description = "A list of areas that should be redacted")
private List<RedactionArea> redactions;

@Schema(description = "Convert the redacted PDF to an image", defaultValue = "false")
private boolean convertPDFToImage;

@Schema(description = "The color used to fully redact certain pages")
private String pageRedactionColor;
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
package stirling.software.SPDF.model.api.security;

import io.swagger.v3.oas.annotations.media.Schema;
import lombok.Data;

@Data
public class RedactionArea {
@Schema(description = "The left edge point of the area to be redacted.")
private Double x;
@Schema(description = "The top edge point of the area to be redacted.")
private Double y;

@Schema(description = "The height of the area to be redacted.")
private Double height;
@Schema(description = "The width of the area to be redacted.")
private Double width;

@Schema(description = "The page on which the area should be redacted.")
private Integer page;

@Schema(description = "The color used to redact the specified area.")
private String color;
}
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
package stirling.software.SPDF.utils.propertyeditor;

import java.beans.PropertyEditorSupport;
import java.util.ArrayList;
import java.util.List;

import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

import lombok.extern.slf4j.Slf4j;
import stirling.software.SPDF.model.api.security.RedactionArea;

@Slf4j
public class StringToArrayListPropertyEditor extends PropertyEditorSupport {

private final ObjectMapper objectMapper = new ObjectMapper();

@Override
public void setAsText(String text) throws IllegalArgumentException {
if (text == null || text.trim().isEmpty()) {
setValue(new ArrayList<>());
return;
}
try {
objectMapper.configure(DeserializationFeature.ACCEPT_SINGLE_VALUE_AS_ARRAY, true);
TypeReference<ArrayList<RedactionArea>> typeRef = new TypeReference<ArrayList<RedactionArea>>() {
};
List<RedactionArea> list = objectMapper.readValue(text, typeRef);
setValue(list);
} catch (Exception e) {
log.error("Exception while converting {}", e);
throw new IllegalArgumentException(
"Failed to convert java.lang.String to java.util.List");
}
}
}
14 changes: 14 additions & 0 deletions src/main/resources/messages_en_GB.properties
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,7 @@ pages=Pages
loading=Loading...
addToDoc=Add to Document
reset=Reset
apply=Apply

legal.privacy=Privacy Policy
legal.terms=Terms and Conditions
Expand Down Expand Up @@ -474,6 +475,10 @@ home.autoRedact.title=Auto Redact
home.autoRedact.desc=Auto Redacts(Blacks out) text in a PDF based on input text
autoRedact.tags=Redact,Hide,black out,black,marker,hidden

home.redact.title=Manual Redaction
home.redact.desc=Redacts a PDF based on selected text, drawn shapes and/or selected page(s)
redact.tags=Redact,Hide,black out,black,marker,hidden,manual

home.tableExtraxt.title=PDF to CSV
home.tableExtraxt.desc=Extracts Tables from a PDF converting it to CSV
tableExtraxt.tags=CSV,Table Extraction,extract,convert
Expand Down Expand Up @@ -579,6 +584,15 @@ autoRedact.customPaddingLabel=Custom Extra Padding
autoRedact.convertPDFToImageLabel=Convert PDF to PDF-Image (Used to remove text behind the box)
autoRedact.submitButton=Submit

#redact
redact.title=Manual Redaction
redact.header=Manual Redaction
redact.submit=Redact
redact.pageBasedRedaction=Page-based Redaction
redact.convertPDFToImageLabel=Convert PDF to PDF-Image (Used to remove text behind the box)
redact.pageRedactionNumbers.title=Pages
redact.pageRedactionNumbers.placeholder=(e.g. 1,2,8 or 4,7,12-16 or 2n-1)
redact.redactionColor.title=Redaction Color

#showJS
showJS.title=Show Javascript
Expand Down
Loading

0 comments on commit 22af79a

Please sign in to comment.