Evaluating the feedback requires a basic technical understanding of large language models, of the Genow application, and of Genow’s existing and planned features and limitations. The higher the quality of the feedback received, the more useful the evaluation will be.
We recommend encouraging users to provide continuous, comprehensive feedback via the feedback function so that the quality of the use case can be developed further. This feedback can then be viewed in the Admin Panel. For a structured analysis, we recommend processing the feedback as CSV; to do this, use the “Download Feedback” function in the Admin Panel.
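If you work with the CSV export, a short script can give a quick first overview before the detailed evaluation. This is only a minimal sketch: the file name and the column names (“rating”, “comment”) are assumptions and need to be adapted to the headers of your actual export.

```python
# Minimal first pass over the exported feedback CSV.
# "feedback_export.csv", "rating" and "comment" are placeholder names -
# adjust them to the actual export from the Admin Panel.
import pandas as pd

feedback = pd.read_csv("feedback_export.csv")

# Separate positive and negative ratings for the later cluster analysis.
positive = feedback[feedback["rating"] == "positive"]
negative = feedback[feedback["rating"] == "negative"]

print(f"{len(positive)} positive / {len(negative)} negative feedback entries")
print(negative["comment"].head())
```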
Introduction to the Feedback Dashboard in the Admin Panel
Navigate to the Admin Panel and select your use case; you will land in the use case settings. On the right, you will find the “Feedback Dashboard” in the “Evaluate Use Case” section. At the top of the page, you will see the total number of feedback entries, their distribution into positive and negative ratings over different time horizons, and the star rating in each case. At the bottom of the page, you can expand the list of feedback received for the use case. By clicking “View”, you can see all the relevant information provided with the feedback (the filled-out fields of the feedback form as well as the user’s questions and Genow’s answers).
Clusters are useful for evaluating feedback: similar feedback is grouped together. As a first step, it is a good idea to separate positive and negative feedback. Positive feedback shows how well users have adopted the technology and how well they have been trained; negative feedback provides insight into potential for improvement. The following table gives an overview of common clusters and how to address them; please note that it refers to the evaluation of negative feedback. A sketch of how these clusters can be fixed as a shared tagging vocabulary follows the table.
| Cluster Name | Description | Responsibility | Takeaway |
|---|---|---|---|
| Correct | Genow’s response was actually correct based on the connected data and the technical capabilities of language models, but negative feedback was given anyway. | Use Case | User training on expectation management or on the capabilities and limitations of language models may be necessary. |
| Information missing | The information is not available to the language model but should be added. | Use Case | Update the information. |
| Missing feature / will be fixed by updates | The issue will be corrected by Genow (roadmap). | Use Case / Genow | The use case contacts Genow to submit a feature request. If the matter is urgent, a project contract can be concluded for the implementation. Sometimes the feature is already on Genow’s roadmap, or it is a known problem that will be resolved with a short-term release. |
| Information inaccurate | Information in existing documents is inaccurate or insufficiently formulated, which has a negative impact on the quality of the response. | Use Case | Update the document and make sure the guidelines and recommendations for your data are followed (see the Data Maintenance article). |
| Imprecise question / interaction in need of improvement | User input is sometimes crucial when complex information is to be extracted from documents. | Use Case | User training is necessary. |
| Information not output correctly | The information is not extracted correctly although it is available. | Use Case / Genow | Depends on the individual case. The files and their names should always be checked. In complex cases, pipeline optimisation is necessary; this can be carried out by Genow. |
| Available feature or optimisation of an existing feature would provide improvement | An issue or process during use could be fixed or improved by an available feature. | Use Case / Genow | Depends on the individual case. |
| Bug | Error message or demonstrably incorrect behaviour by Genow; ideally reproducible. | Genow | Send the error message, the time it occurred, a description and a screenshot to [email protected] for error correction. |
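To keep the cluster names consistent across evaluators, it can help to fix them as a controlled vocabulary before tagging starts. The following sketch simply mirrors the table above in Python; the responsibility grouping at the end is an illustration, not a complete mapping.

```python
# Fixed cluster vocabulary so that all evaluators use the same labels.
# The names mirror the cluster table above.
from enum import Enum

class Cluster(str, Enum):
    CORRECT = "Correct"
    INFORMATION_MISSING = "Information missing"
    MISSING_FEATURE = "Missing feature / will be fixed by updates"
    INFORMATION_INACCURATE = "Information inaccurate"
    IMPRECISE_QUESTION = "Imprecise question / interaction in need of improvement"
    NOT_OUTPUT_CORRECTLY = "Information not output correctly"
    FEATURE_OPTIMISATION = "Available feature or optimisation of an existing feature"
    BUG = "Bug"

# Illustrative grouping: clusters the use case team can address on its own.
USE_CASE_ONLY = {
    Cluster.CORRECT,
    Cluster.INFORMATION_MISSING,
    Cluster.INFORMATION_INACCURATE,
    Cluster.IMPRECISE_QUESTION,
}
```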
All feedback should be reviewed. For this purpose, either a spreadsheet (e.g. an Excel file) or the existing Feedback Dashboard in the Genow Admin Panel can be used. For each piece of feedback, the query, Genow’s response, and the user comment should be recorded. The question should then be assigned to one, or at most two, clusters, and a comment can be added to justify the cluster assignment. A possible structure is shown in the following example table; a short script for creating such a sheet follows the table:
| Question | Answer from Genow | Genow thread history | Target answer / comment by test user | Positive / negative | Comment on cluster assignment | Cluster 1 | Cluster 2 |
|---|---|---|---|---|---|---|---|
| The user request sent to Genow that was evaluated | Genow’s answer to the graded question | History of the previous communication before the graded response | Comment by the evaluator; contains information about what answer would have been expected or whether information is missing | | | | |
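A sheet with this structure can be created as an empty template, for example with pandas. This is only a sketch: “graded_feedback.xlsx” is a placeholder file name, and writing Excel files requires the openpyxl package.

```python
# Create an empty evaluation sheet with the columns from the example table.
import pandas as pd

columns = [
    "Question",
    "Answer from Genow",
    "Genow thread history",
    "Target answer / comment by test user",
    "Positive / negative",
    "Comment on cluster assignment",
    "Cluster 1",
    "Cluster 2",
]

evaluation = pd.DataFrame(columns=columns)
evaluation.to_excel("graded_feedback.xlsx", index=False)  # requires openpyxl
```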
The clusters created and the frequency with which they occur should be placed in an overall context with the total number of feedback entries, the split into positive and negative feedback, and the total number of requests. The summarized evaluation can be presented in the following table:
| Evaluation | Amount | Takeaway | Optimisation steps |
|---|---|---|---|
| Total number of requests (see Analytics Dashboard in the Admin Panel) | | E.g. used more or less than expected | E.g. motivate users to increase usage |
| Positive feedback | | E.g. comparison to the total number of requests | E.g. is go-live or the next project phase possible? |
| Negative feedback | | E.g. comparison to the total number of requests | E.g. is optimisation necessary? |
| Responsibility Use Case | | Sum of the clusters that can be solved internally by the use case (e.g. updating information); an indicator of how much work still needs to be done internally. | |
| Responsibility Genow | | Sum of the clusters that will be solved by Genow (e.g. bugs) | |
| Optimisation necessary | | Clusters that can be improved through optimisations in the data pipeline or through updates and features. | In consultation with Genow, depending on the individual case: e.g. glossaries, fallback strategy, individual base prompt, metadata, … |
| Cluster 1 | | | |
| Cluster 2 … | | | |
The table can be supplemented with any other relevant clusters. For the sake of clarity, a graphical representation is recommended at the end.
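As a sketch of such a graphical summary, the cluster columns of the evaluation sheet can be counted and plotted as a bar chart. The file and column names assume the template from the sketch above; adapt them to your own sheet.

```python
# Count how often each cluster was assigned and plot the distribution.
import pandas as pd
import matplotlib.pyplot as plt

evaluation = pd.read_excel("graded_feedback.xlsx")  # requires openpyxl

# A feedback entry may carry one or, at most, two clusters.
clusters = pd.concat([evaluation["Cluster 1"], evaluation["Cluster 2"]]).dropna()
counts = clusters.value_counts()
print(counts)

counts.plot(kind="bar")
plt.ylabel("Number of feedback entries")
plt.title("Negative feedback by cluster")
plt.tight_layout()
plt.savefig("feedback_clusters.png")
```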