Open Source Java Text Processing Software

Browse free open source Java Text Processing Software and projects below.

  • 1
    A Swiss Army Knife GUI application for PDF documents: combine, split, rotate, reorder (n-up, booklet), watermark, edit bookmarks/fileinfo/pagetransition, compress, encrypt, decrypt, sign, repair, edit attachments and more.
    Downloads: 98 This Week
  • 2
    XSLT syntax highlighting

    Java based XSLT Processor extension for syntax highlighting

    Please note that the project has moved to GitHub (https://github.com/xmlark/xslthl). This is an implementation of syntax highlighting as an extension module for XSLT processors (Xalan, Saxon). If you have, for example, an article about programming written in DocBook, its code examples can be syntax highlighted automatically during the XSLT processing phase.
    Downloads: 111 This Week
  • 3
    iText®, a JAVA PDF library

    PDF Library for Developers

    iText is an open-source PDF library available for Java and .NET (C#). iText allows you to effortlessly generate and manipulate standards-compliant PDF documents with a powerful and feature-rich SDK. With iText, you can create archivable and accessible PDFs, split and merge documents, fill and flatten forms, digitally sign documents, and more. iText add-ons enable additional functionality, such as PDF creation from HTML templates, secure redaction, OCR, and much more. The latest versions of iText build on the success of previous versions and feature an improved document engine, high and low-level programming capabilities, and a more efficient modular structure. iText represents the next level for developers looking to leverage PDF in document workflows. The main project page for iText is now on GitHub, and all the latest releases, code samples, open source add-ons and tools, etc. can be found at https://github.com/itext/.
    Downloads: 191 This Week
  • 4
    The DITA Open Toolkit is an implementation of the OASIS DITA XML Specification. The Toolkit transforms DITA content into many deliverable formats. See https://www.dita-ot.org/ for documentation and links to downloads. The source code and issue trackers have been moved to https://github.com/dita-ot/dita-ot
    Downloads: 30 This Week
  • 5

    ConcatPDF

    PDF Concatenation Tool

    ConcatPDF is a tool for concatenating PDF files. It can concatenate, extract, encrypt, decrypt, and configure PDF files, and convert image files to PDF. Both GUI and CUI versions are available. iText.NET is a port of iText to the .NET Framework using J#. This library allows you to generate PDF, (X)HTML, XML, and RTF files on the Microsoft .NET Framework, including ASP.NET.
    Downloads: 30 This Week
  • 6
    Jericho HTML Parser is a Java library for analysing and manipulating parts of an HTML document, including server-side tags, while reproducing verbatim any unrecognised or invalid HTML.
    Downloads: 4 This Week
  • 7
    Jaxe
    Jaxe is a free Java XML editor with a configurable GUI, using XML schemas for validation and XSL for exports in HTML or XML.
    Downloads: 4 This Week
  • 8
    RTextDoc

    An editor for structured documents

    RTextDoc is an editor for structured text documents such as LaTeX, AsciiDoc, and DocBook. It has proofreading capabilities: on-the-fly spelling, instant grammar checking, and built-in free dictionaries. It also offers syntax highlighting, bracket matching, folding, a document structure browser for sections and labels, bookmarks, a manager for LaTeX symbols, an editor for mathematical equations, an integrated BibTeX database manager, and several tools to convert LaTeX to HTML and back. AsciiDoc files can be converted to DocBook, HTML, and PDF files.
    Downloads: 4 This Week
  • 9

    xmlj

    XMLJ is a Java XML Editor and validator project.
    Downloads: 9 This Week
  • 10
    HTMLtools includes several Java HTML tools for preparing Web pages. The HTMLtools program automates batch conversion of tab-delimited spreadsheet text files to HTML Web-page files, file & table editing, keyword mapping, templates, and more.
    Downloads: 4 This Week
  • 11
    The NRtfTree library is a set of classes written entirely in C# that can be used to manage (read and write) RTF documents in your own applications. A Java port of the library can be found at http://www.sgoliver.net/blog/?page_id=92
    Downloads: 2 This Week
  • 12
    This is an Eclipse XML editor with several editing capabilities. The main features concern interaction with the classes and resources declared in XML (Open class/resource, Create class), similar to the interaction between classes in the Java editor.
    Downloads: 2 This Week
  • 13
    DITA2wiki is a toolkit that enables you to publish DITA content (maps and topics) to a wiki.
    Downloads: 1 This Week
  • 14

    Change File Encoding

    Change encoding of text files.

    Change File Encoding is a utility that allows you to change the encoding of text files. For example, files saved in US-ASCII can be converted to UTF-8. Over 170 encodings are supported. Requires Java 1.8 or higher.
    Downloads: 3 This Week
  • 15
    A stand-alone editor that uses the MediaWiki markup language to generate HTML code. You can create and preview pages written in MediaWiki markup (i.e. Wikipedia pages) while offline.
    Downloads: 1 This Week
  • 16
    CPLed is an OpenSIPS tool for editing CPL scripts in a friendly and easy graphical way. It can be used as a standalone application or embedded in a web page as an applet. It also provides CPL script transport functionality via the SIP and HTTP protocols.
    Downloads: 1 This Week
  • 17
    Java csveditor
    A CSV editor written in Java 1.6 and developed using NetBeans IDE 6.0.1.
    Downloads: 1 This Week
  • 18
    Any2XML allows the controlled conversion of any text file to an XML file. In the editor (GUI), rules are added to define which parts of the text file are processed and where they are placed in the resulting XML file.
    Downloads: 0 This Week
  • 19
    The DITA Open Platform is a free, open-source project whose goal is to provide an enterprise platform for editing, managing, and processing DITA documents.
    Downloads: 0 This Week
  • 20

    Detexter

    Detexter is an app designed to extract text from PDF files.

    Detexter lets you extract text from multiple PDF files. Detexter uses the PDFBox library for its text extraction.
    Downloads: 0 This Week
  • 21
    The DocBook Publishing Utilities are tools that make creating and publishing DocBook easier. The tools are: a Maven plug-in to transform HTML into XML (use after docbkx); an Eclipse DocBook table editor; and Eclipse wizards for initial DocBook files.
    Downloads: 0 This Week
  • 22
    Doco is a simple but feature-rich and powerful markup language for converting text documents into highly presentable and navigable web content.
    Downloads: 0 This Week
  • 23
    EasyCite is an easy-to-use, complete citation project that helps researchers and students generate bibliographies and footers.
    Downloads: 0 This Week
  • 24
    Prototype for a framework and user interface for combining various structured search and document clustering techniques.
    Downloads: 0 This Week
  • 25
    FigTeX manages images and makes them easy to include in LaTeX documents. As with BibTeX, the image information is stored in an external file and imported into the document as needed. It comes with a convenient GUI for managing the image library.
    Downloads: 0 This Week