Very nice reflection,

It's always good to experiment with new approaches and ideas.

However, when Google created BERT, they chose to use only encoder layers, whereas GPT is built only on decoder layers. This choice follows from the learning objective: BERT's pretraining tasks do not require a decoder. So, did you really need a decoder in your case? I don't think so.
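To make the distinction concrete, here is a minimal sketch of the attention pattern each kind of layer uses. This is my own illustration, not code from BERT or GPT, and the function names `encoder_mask` and `decoder_mask` are hypothetical. An encoder-style layer lets every token attend to every other token, while a decoder-style layer uses a causal mask so each token only sees itself and earlier positions.

```python
def encoder_mask(n):
    """Bidirectional mask: every token may attend to every position."""
    return [[1] * n for _ in range(n)]


def decoder_mask(n):
    """Causal mask: token i may attend only to positions 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # Print both masks for a 4-token sequence (1 = allowed, 0 = blocked).
    print("encoder-style (bidirectional):")
    for row in encoder_mask(4):
        print(row)
    print("decoder-style (causal):")
    for row in decoder_mask(4):
        print(row)
```

If your task only needs a representation of the whole input (classification, tagging, and so on), the first pattern is usually enough, which is why I doubt a decoder was necessary here.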

Abdelkader Rhouati

Data scientist & Ph.D. researcher on AI. My area of expertise is around Deep Learning, NLP, and XAI — https://abdelkader-rhouati.medium.com/membership