BNTRANSLIT is a deep-learning-based transliteration app for Bangla words.

Installation

pip install bntranslit

Dependency

  • PyTorch 1.7.0 or later

NB: No GPU needed. It runs entirely on the CPU.

Pre-trained Model

Usage

from bntranslit import BNTransliteration
model_path = "bntranslit_model.pth"
bntrans = BNTransliteration(model_path)
word = "aami"
output = bntrans.predict(word, topk=10)
# output: ['আমি', 'আমী', 'অ্যামি', 'আমিই', 'এমি', 'আমির', 'আমিদ', 'আমই', 'আমে', 'আমিতে']

Datasets and Training Details

  • We used the Google Dakshina dataset.
  • Thanks to AI4Bharat for providing a training notebook with detailed explanations.
  • We trained on the Bangla lexicon training set of the Google Dakshina dataset for 10 epochs with batch size 128, 1e-3 (presumably the learning rate), embedding dim = 300, hidden dim = 512, and an LSTM with attention.
  • We evaluated the trained model on the Bangla lexicon test set of the Google Dakshina dataset using the AI4Bharat evaluation script; the results are in docs/evaluation_summary.txt.
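
For readers who want to reproduce a similar setup, the hyperparameters listed above could be gathered in one place roughly like this. This is only a sketch: `TrainConfig` and its field names are hypothetical and not part of bntranslit, and treating 1e-3 as the learning rate is an assumption.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container for the settings listed above (not a bntranslit API)."""
    epochs: int = 10
    batch_size: int = 128
    learning_rate: float = 1e-3  # assumption: 1e-3 is taken to be the learning rate
    embedding_dim: int = 300
    hidden_dim: int = 512
    rnn_type: str = "lstm"
    use_attention: bool = True


config = TrainConfig()
print(asdict(config))
```

Keeping the values in a frozen dataclass makes it harder to change one of them by accident halfway through an experiment.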

Suppose you have a large text corpus and you can't load the whole file into memory on a computer with little RAM.

Here is a solution for processing large corpora using a Python generator.

class CorpusProcessing:
    def __init__(self, data_path):
        self.data_path = data_path

    def __iter__(self):
        # Read lazily, one line at a time, so the whole file never sits in memory
        with open(self.data_path, encoding="utf-8") as corpus:
            for line in corpus:
                # do your processing here
                # here I am doing whitespace tokenization
                tokens = line.split()
                yield tokens

process = CorpusProcessing('large_corpora.txt')
for tokens in process:
    print(tokens)
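
The same lazy pattern can feed an aggregation step without loading the corpus. Below is a minimal sketch that counts token frequencies while streaming. The helper names `iter_tokens` and `count_tokens` and the tiny temporary sample file are made up for illustration.

```python
import os
import tempfile
from collections import Counter


def iter_tokens(data_path):
    """Yield one whitespace-tokenized line at a time; the file is never fully loaded."""
    with open(data_path, encoding="utf-8") as corpus:
        for line in corpus:
            tokens = line.split()
            if tokens:  # skip blank lines
                yield tokens


def count_tokens(data_path):
    """Accumulate token frequencies while streaming through the file."""
    counts = Counter()
    for tokens in iter_tokens(data_path):
        counts.update(tokens)
    return counts


# Tiny stand-in corpus so the sketch runs anywhere; replace it with your real file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("ami bhat khai\n\nami boi pori\n")
    sample_path = tmp.name

token_counts = count_tokens(sample_path)
os.remove(sample_path)
print(token_counts.most_common(1))  # [('ami', 2)]
```

Memory use here grows with the vocabulary size rather than with the size of the file, which is usually what you want on a low-RAM machine.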

References

Thanks To

  • Faruk Ahmad vai, for pushing me to learn Python generators

BENDeep is a PyTorch-based deep learning solution for Bengali NLP tasks such as Bengali translation, Bengali sentiment analysis, and so on.

https://github.com/sagorbrur/bendeep

Installation

pip install bendeep

Dependency

  • PyTorch 1.5.0+

Pretrained Model

API

Sentiment Analysis

Analyzing Sentiment

This sentiment analysis model is an RNN-based GRU model trained on the Socian sentiment dataset, with loss 0.073 in…

This Bengali language model is built with fastai's ULMFiT and is ready for prediction and classification tasks.

https://github.com/sagorbrur/bnlm

  • This tool mostly follows iNLTK
  • We separated out the Bengali part and obtained better evaluation results

Installation

pip install bnlm

Dependencies

  • PyTorch >=1.0.0 and <=1.3.0

Evaluation Result

Language Model

  • Accuracy 48.26% on validation dataset
  • Perplexity: ~22.79
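
To put the perplexity figure in context: the post does not say how it was computed. Under the common definition, where perplexity is the exponential of the mean per-token cross-entropy measured in nats, the two values convert as sketched below. The function names are illustrative only.

```python
import math


def perplexity_from_loss(cross_entropy_nats):
    """Perplexity under the assumed definition: exp(mean cross-entropy in nats)."""
    return math.exp(cross_entropy_nats)


def loss_from_perplexity(perplexity):
    """Inverse conversion: the cross-entropy implied by a given perplexity."""
    return math.log(perplexity)


reported_perplexity = 22.79  # figure quoted in the list above
implied_loss = loss_from_perplexity(reported_perplexity)
print(round(implied_loss, 2))  # 3.13
```

Under that assumption, a perplexity of about 22.79 corresponds to a validation loss of roughly 3.13 nats per token.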

Features and API

Download pretrained Model

To start, first download pretrained Language…

Installation

$ sudo apt-get install libfreetype6-dev libharfbuzz-dev libfribidi-dev gtk-doc-tools
$ git clone https://github.com/python-pillow/Pillow.git
$ cd Pillow/depends
$ chmod +x install_raqm.sh
$ ./install_raqm.sh
$ conda install pillow

Testing

import numpy as np
from PIL import ImageFont, ImageDraw, Image
import cv2

# Make a black canvas and set the text color (BGR order, plus alpha)
img = np.zeros((200, 400, 3), np.uint8)
b, g, r, a = 0, 255, 0, 0

# Use a Bengali font to write Bengali text
fontpath = "./Siyamrupali.ttf"
font = ImageFont.truetype(fontpath, 48)
img_pil = Image.fromarray(img)
draw = ImageDraw.Draw(img_pil)
draw.text((100, 80), "মুক্তিযুদ্ধ", font=font, fill=(b, g, r, a))

# Convert back to a NumPy array and save with OpenCV
img = np.array(img_pil)
cv2.imwrite("res2.png", img)
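
If the conjuncts in the output image look broken, it may be worth confirming that your Pillow build actually picked up Raqm, the complex text layout library the installation steps set up. A minimal check using Pillow's `features` module might look like this. The wrapper name `raqm_available` is made up for illustration.

```python
def raqm_available():
    """Return True if the installed Pillow reports Raqm (complex text layout) support."""
    try:
        from PIL import features
    except ImportError:
        return False  # Pillow itself is not installed
    # features.check may return None when the FreeType module is missing, hence bool()
    return bool(features.check("raqm"))


print("Raqm support:", raqm_available())
```

If this prints False, Pillow will most likely fall back to its basic layout engine, which does not shape Bengali conjuncts correctly.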

References

https://stackoverflow.com/questions/50854235/how-to-draw-chinese-text-on-the-image-using-cv2-puttextcorrectly-pythonopen

https://github.com/python-pillow/Pillow/issues/3593

BERT (BIDIRECTIONAL ENCODER REPRESENTATIONS FROM TRANSFORMERS)

Wherever you look for the state of the art in the NLP world, the model whose name sits at the top is BERT. (State-of-the-art charts now change quickly, though, so this statement may no longer be accurate.) BERT is a language model that essentially uses the context of a sentence from both directions…

NB: To learn about ROUGE, check this link

Installing Perl and Rouge

sudo apt-get install perl
  • To install XML::DOM (a requirement for ROUGE to work), we first install the Synaptic package manager
sudo apt-get update
sudo apt-get install synaptic
  • Once the Synaptic package manager is installed, search for it in your applications and launch…

You are an enthusiastic machine learning learner, but you can't afford a powerful machine to train and test your models.

Don’t worry!

Google Colab is a perfect solution for you. Google Colab (Colaboratory) is a free cloud service meant to encourage machine learning research.

Here is a simple introduction how…
