Parameters in doc2vec
Here are some of the parameters of gensim’s Doc2Vec class.
window
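window is the maximum distance between the predicted word and the context words used to predict it within a document. The window extends both behind and ahead of the predicted word.
In the skip-gram model with a window size of 2, each input word is paired with every word up to two positions before or after it, and each such pair is one training sample (the blue word in the figure is the input word).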
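To make the pairing concrete, here is a minimal sketch that lists skip-gram (input word, context word) pairs for a fixed window. The function name `skipgram_pairs` and the example sentence are illustrative assumptions, not part of gensim, and the sketch ignores any extra sampling an actual implementation may apply.

```python
def skipgram_pairs(tokens, window=2):
    """Return (input_word, context_word) pairs for a fixed window size.

    Hypothetical helper for illustration only; not a gensim function.
    """
    pairs = []
    for i, center in enumerate(tokens):
        # Look up to `window` positions behind and ahead of the input word.
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # a word is never its own context
                pairs.append((center, tokens[j]))
    return pairs


sentence = "the quick brown fox jumps".split()
pairs = skipgram_pairs(sentence, window=2)
for pair in pairs:
    print(pair)
```

Words near the start or end of the sentence produce fewer pairs, because the window is cut off at the sentence boundary.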

min_count
Words that appear fewer than min_count times in the corpus are ignored.
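The effect of this threshold can be sketched in plain Python. The helper `filter_rare_words` and the toy documents are assumptions made for illustration; gensim applies the filter internally while building its vocabulary.

```python
from collections import Counter


def filter_rare_words(documents, min_count=2):
    """Drop words whose total corpus frequency is below min_count.

    Hypothetical helper for illustration only; not a gensim function.
    """
    # Count every word across all documents, not per document.
    counts = Counter(word for doc in documents for word in doc)
    vocab = {word for word, count in counts.items() if count >= min_count}
    return [[word for word in doc if word in vocab] for doc in documents]


docs = [["apple", "banana", "apple"], ["banana", "cherry"]]
filtered = filter_rare_words(docs, min_count=2)
print(filtered)
```

Here "cherry" occurs only once in the whole corpus, so it is removed, while "apple" and "banana" are kept.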
sample
High-frequency words such as “the” contribute little to training. sample is the threshold for randomly downsampling these high-frequency words. The probability of keeping the word \(w_i\) is:
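As a sketch, assuming the keep-probability used in the original word2vec C code (which gensim's vocabulary preparation appears to follow), with \(z(w_i)\) the fraction of all corpus words that are \(w_i\) and \(s\) the sample value (both symbols are introduced here for illustration):

\[P(w_i) = \left(\sqrt{\frac{z(w_i)}{s}} + 1\right) \cdot \frac{s}{z(w_i)}\]

Values above 1 are treated as 1, so only words frequent enough relative to \(s\) are ever dropped. The function name `keep_probability` below is hypothetical.

```python
import math


def keep_probability(word_fraction, sample=1e-3):
    """Probability of keeping a word under the assumed word2vec-style rule.

    word_fraction: share of all corpus tokens that are this word (z).
    sample: the downsampling threshold (s); 0 disables downsampling.
    Hypothetical helper for illustration only; not a gensim function.
    """
    if sample <= 0 or word_fraction <= 0:
        return 1.0
    ratio = word_fraction / sample
    # (sqrt(z/s) + 1) * s/z, capped at 1 because it is a probability.
    return min(1.0, (math.sqrt(ratio) + 1) / ratio)


for fraction in (0.001, 0.01, 0.1):
    print(fraction, round(keep_probability(fraction), 4))
```

With the threshold at 0.001, a word that makes up 1% of the corpus is kept a bit over 40% of the time, and the keep probability falls further as the word becomes more frequent.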
