Emotion Recognition (ER) is a significant challenge in pattern recognition and is crucial for many Artificial Intelligence (AI) applications, from monitoring children with autism to enhancing video games and human-computer interaction. Facial features are essential for discerning human emotions, which motivates this study's goal of improving facial emotion detection accuracy, a requirement for real-time applications. This research explores the use of both full facial images and facial landmarks, on the premise that landmarks offer clearer emotional cues. A Convolutional Neural Network (CNN) is employed for feature extraction and optimized with techniques such as Stochastic Gradient Descent (SGD), while an adaptive Genetic Algorithm (GA) optimizes the CNN structure. The study combines upper facial landmarks with full facial features to enhance detection accuracy: a dual-input neural network processes the full 48 × 48 facial image through a CNN branch and the upper facial landmarks through a second input, then concatenates the two streams and feeds them into dense layers with L2 regularization. The model is trained and evaluated with several optimizers to find the best configuration, and performance metrics are plotted and compared. This design allows detailed pixel information and focused landmark features to jointly improve emotion detection accuracy. The FER2013 and Extended Cohn-Kanade (CK+) datasets are used for training and testing. Results show a significant accuracy improvement with the proposed method, further enhanced by a highly optimized model structure and an adaptive SGD optimizer; integrating adaptive learning rates and momentum terms reduces model loss and accelerates convergence.
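The dual-input fusion described above can be sketched in minimal NumPy. The specific sizes here (a 128-dimensional CNN feature vector, 10 upper-facial landmark coordinate pairs, a 64-unit hidden layer, and a 1e-3 L2 coefficient) are illustrative assumptions, not values taken from the study; only the 48 × 48 input, the 7 FER2013 emotion classes, and the concatenate-then-dense structure come from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, w, b):
    """Fully connected layer with ReLU activation."""
    return np.maximum(0.0, x @ w + b)

# Hypothetical stand-ins: the CNN branch reduces a 48x48 face image to a
# 128-d feature vector; the second input carries 10 upper-facial landmark
# (x, y) coordinates (eyes and eyebrows), i.e. 20 values.
cnn_features = rng.normal(size=(1, 128))
landmark_input = rng.normal(size=(1, 20))

# Fusion: concatenate the two streams into a single vector ...
fused = np.concatenate([cnn_features, landmark_input], axis=1)  # (1, 148)

# ... then pass it through dense layers; 7 classes matches FER2013.
w1, b1 = rng.normal(size=(148, 64)) * 0.1, np.zeros(64)
w2, b2 = rng.normal(size=(64, 7)) * 0.1, np.zeros(7)
hidden = dense(fused, w1, b1)
logits = hidden @ w2 + b2

# Softmax over the 7 emotion classes.
probs = np.exp(logits - logits.max()) / np.exp(logits - logits.max()).sum()

# L2 regularization adds a weight-decay penalty to the training loss
# (the 1e-3 coefficient is an assumed value).
l2_penalty = 1e-3 * (np.sum(w1 ** 2) + np.sum(w2 ** 2))
print(probs.shape)
```

In a real framework the two branches would be trained jointly, with the L2 penalty added to the classification loss; this sketch only shows how the two inputs meet.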
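The closing claim, that a momentum term reduces loss faster, can be illustrated with a toy comparison on an ill-conditioned quadratic loss. This is a sketch of classical (heavy-ball) momentum with assumed hyperparameters; it does not reproduce the study's adaptive learning-rate component:

```python
import numpy as np

def sgd_momentum_step(w, grad, velocity, lr=0.02, momentum=0.9):
    """One SGD update with a classical momentum (heavy-ball) term."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity

# Toy quadratic loss L(w) = 0.5 * w^T A w with an ill-conditioned A,
# a setting where momentum typically speeds up convergence noticeably.
A = np.diag([1.0, 25.0])
loss = lambda w: 0.5 * w @ A @ w
grad = lambda w: A @ w

w_plain = np.array([1.0, 1.0])
w_mom = np.array([1.0, 1.0])
v = np.zeros(2)
lr = 0.02

for _ in range(100):
    w_plain = w_plain - lr * grad(w_plain)  # plain gradient descent
    w_mom, v = sgd_momentum_step(w_mom, grad(w_mom), v, lr=lr)

# After the same number of steps, the momentum run sits at a lower loss.
print(loss(w_plain), loss(w_mom))
```

The velocity accumulates gradients along the shallow direction of the loss surface, which is why the momentum run reaches a lower loss in the same number of steps.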