Content indexing in Django using Apache Tika
For the Documents module of our new open-source Generic Intranet, we need to be able to extract the text content and metadata from various kinds of documents:
- PDF files
- Microsoft ...
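As a rough illustration of this kind of extraction, here is a minimal sketch that shells out to the Tika command-line app. It assumes a local `tika-app.jar` (the path is a placeholder) and Java on the `PATH`. The helper names `tika_command`, `parse_metadata`, and `extract` are hypothetical and are not part of Tika or Django. It relies on the app's `--text` and `--metadata` options, and assumes the metadata is printed as one `key: value` pair per line.

```python
import subprocess

# Placeholder location of the Tika app jar; adjust to your installation.
TIKA_JAR = "tika-app.jar"


def tika_command(path, mode="text", jar=TIKA_JAR):
    """Build the command line for one Tika run (hypothetical helper)."""
    flags = {"text": "--text", "metadata": "--metadata"}
    return ["java", "-jar", str(jar), flags[mode], str(path)]


def parse_metadata(output):
    """Turn 'key: value' lines into a dict, skipping lines without a separator."""
    metadata = {}
    for line in output.splitlines():
        key, sep, value = line.partition(": ")
        if sep:
            metadata[key.strip()] = value.strip()
    return metadata


def extract(path, jar=TIKA_JAR):
    """Return (text, metadata) for a document by running Tika twice."""
    text = subprocess.run(
        tika_command(path, "text", jar),
        capture_output=True, text=True, check=True,
    ).stdout
    raw_metadata = subprocess.run(
        tika_command(path, "metadata", jar),
        capture_output=True, text=True, check=True,
    ).stdout
    return text, parse_metadata(raw_metadata)
```

Starting a JVM for every document is slow, so a sketch like this is better suited to a background indexing job than to a request handler.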