DOCK_BYTE

The DOCK_BYTE module extracts text from PDF and TXT documents and lets you explore the extracted content through an interactive chat with a language model. It uses PyMuPDF and Tesseract for document processing and Streamlit for an optional GUI.

Features

Extract text from PDF documents using PyMuPDF.
Perform OCR on PDF documents using Tesseract.
Extract text from TXT files.
Use a language model to chat with the content of the documents.
GUI support with Streamlit for interactive usage.
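
To make the extraction features concrete, here is a minimal sketch of how text might be pulled from a TXT or PDF file. It is not DOCK_BYTE's own code: the helper name `extract_text` is hypothetical, and the sketch assumes PyMuPDF is installed for the PDF branch. OCR with Tesseract is left out.

```python
from pathlib import Path


def extract_text(path: str) -> str:
    """Return the text of a .txt or .pdf file.

    Hypothetical helper for illustration; not part of DOCK_BYTE's API.
    """
    file_path = Path(path)
    suffix = file_path.suffix.lower()

    if suffix == ".txt":
        # Plain text needs no extra dependency.
        return file_path.read_text(encoding="utf-8")

    if suffix == ".pdf":
        # Imported lazily so TXT-only use does not require PyMuPDF.
        import fitz  # PyMuPDF

        with fitz.open(str(file_path)) as doc:
            # Join the text layer of every page, one page per block.
            return "\n".join(page.get_text() for page in doc)

    raise ValueError(f"Unsupported file type: {suffix!r}")
```

A scanned PDF with no text layer would come back empty from this sketch, which is the case the OCR feature is meant to cover.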

Installation

pip install DOCK_BYTE

Usage

from dock_byte import chat_with_doc

chat_with_doc("gemma:2b", "data.txt", use_gui=True)
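
How `chat_with_doc` passes document content to the model is not described here. As a rough illustration of the general idea only, the sketch below combines extracted text and a user question into a single prompt. The function `build_prompt` and its `max_chars` budget are assumptions made for this example, not part of the package.

```python
def build_prompt(document_text: str, question: str, max_chars: int = 4000) -> str:
    """Combine document text and a question into one prompt.

    Hypothetical helper for illustration; not part of DOCK_BYTE's API.
    """
    # Truncate long documents so the prompt stays within a rough size budget.
    context = document_text[:max_chars]
    return (
        "Answer the question using only the document below.\n\n"
        f"Document:\n{context}\n\n"
        f"Question: {question}"
    )
```

The resulting string could then be sent to whichever language model backend is in use.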

Running

streamlit run CODE_FILE.py

License

This project is licensed under the MIT License - see the LICENSE file for details.

Repository

For more information and to contribute, please visit the GitHub repository (codebytemirza/DockByte).
