We will train a model for matching these question pairs. Let's start by importing the relevant libraries, as follows:
import sys
import os
import pandas as pd
import numpy as np
import string
import tensorflow as tf
The following function takes a pandas series of text as input and converts it to a list. Each item in the list is cast to a string, made lowercase, stripped of surrounding whitespace, and split into a list of its individual characters. The resulting list of character lists is converted into a NumPy array and returned:
def read_x(x):
    # Lowercase and strip each entry, then split it into a list of characters
    x = np.array([list(str(line).lower().strip()) for line in x.tolist()])
    return x
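To make the character-level output concrete, here is a minimal sketch that runs read_x on a small made-up series. The sample questions and the read_x_ragged variant are assumptions for illustration and do not come from the original text. The variant is included because, on recent NumPy releases, building an array from lists of different lengths typically fails unless dtype=object is passed.

```python
import numpy as np
import pandas as pd

def read_x(x):
    # Same logic as the function above, repeated so this snippet runs on its own.
    return np.array([list(str(line).lower().strip()) for line in x.tolist()])

def read_x_ragged(x):
    # Hypothetical variant (not from the original text): dtype=object keeps
    # one Python list per question when the questions differ in length.
    return np.array([list(str(line).lower().strip()) for line in x.tolist()],
                    dtype=object)

# Made-up sample: both questions have four characters after stripping.
same_length = pd.Series(["  Why? ", "HOW? "])
chars = read_x(same_length)
print(chars.shape)  # (2, 4)

# Made-up sample with questions of different lengths.
mixed_length = pd.Series(["Why?", "How so?"])
ragged = read_x_ragged(mixed_length)
print(ragged.shape)  # (2,)
print(len(ragged[0]), len(ragged[1]))  # 4 7
```

With equal-length inputs the result is a regular two-dimensional array of single characters. With mixed lengths the object-typed variant returns a one-dimensional array whose elements are Python lists, so any padding or truncation to a fixed length would have to happen in a later step.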
Next up is a function that takes a pandas series as input, converts it to a list, and returns it as a NumPy array:
def read_y(y):
    return np.asarray(y.tolist())
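As a quick check of what read_y returns, the sketch below applies it to a small made-up label series. The label values are assumptions for illustration only (1 standing for a matching pair, 0 for a non-matching pair), and the comparison with Series.to_numpy() is shown only as an equivalent route, not as the approach the text uses.

```python
import numpy as np
import pandas as pd

def read_y(y):
    # Same logic as the function above, repeated so this snippet runs on its own.
    return np.asarray(y.tolist())

# Hypothetical labels: 1 for a matching question pair, 0 otherwise.
labels = pd.Series([0, 1, 1, 0])
y = read_y(labels)
print(y.shape)  # (4,)

# pandas can also produce the array directly; the values are the same.
same_values = np.array_equal(y, labels.to_numpy())
print(same_values)  # True
```

Going through a Python list first is slightly roundabout, but it yields a plain one-dimensional array with no pandas index attached, which is what the later training code receives.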
The next function splits the data for training and validation. Validation data is helpful to see how well...