Domain model of selections, text and code for #2 #5

Open
jdittrich opened this issue Nov 28, 2024 · 3 comments

@jdittrich
Collaborator

jdittrich commented Nov 28, 2024

First core domain entities and values that can be used in #2

Probably:

  • Coding (or "highlight/text section") as entity
    • values: uid, Start, End, related Codes
  • Code (or "tag", only mock, this is not the focus)
    • values: uid, name
  • Transcript (or "source", "text")
    • This might be one of the trickier things to represent, since it can have formatting. Maybe we just take Quill's representation for now?
    • Formatted text is often represented as a sequence of consecutive sections, each with its own formatting
    • The GoF Design Patterns book uses a document editor as a case study.
    • The interface needed by the other domain-specific entities might be very limited though, since codings should be the same no matter whether the text is bold or italic. Maybe indices suffice; the view (view model) needs to deal with the rest.
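
A minimal TypeScript sketch of the entities listed above. The field names, the half-open `[start, end)` index convention and the `codedText` helper are assumptions made for illustration; nothing here is taken from the OpenQDA code base.

```typescript
// Hypothetical shapes derived from the bullet list above.
interface Code {
  uid: string;
  name: string;
}

interface Coding {
  uid: string;
  start: number;     // index of the first covered character (inclusive, assumed)
  end: number;       // index after the last covered character (exclusive, assumed)
  codeIds: string[]; // uids of the related codes
}

interface Transcript {
  uid: string;
  text: string; // plain text only; formatting is left to the view (view model)
}

// Returns the plain text a coding covers, independent of bold, italic, etc.
function codedText(transcript: Transcript, coding: Coding): string {
  return transcript.text.slice(coding.start, coding.end);
}

const transcript: Transcript = { uid: "t1", text: "I always struggle here" };
const struggle: Code = { uid: "c1", name: "struggle" };
const coding: Coding = { uid: "k1", start: 9, end: 17, codeIds: [struggle.uid] };

console.log(codedText(transcript, coding)); // prints "struggle"
```
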
@jdittrich
Collaborator Author

jdittrich commented Nov 29, 2024

UML diagram; text description below

  • Use interfaces to abstract
    • the ability to assign codings to some parts of a document, i.e. can-have-codings interface
    • the ability of codings to have codes assigned, i.e. having a list of codes
  • Have the following classes:
    • TextDocument implementing the can-have-codings interface
    • TextCoding implementing the can-have-codes interface
    • Code, being assignable to anything with the can-have-codes interface

This is rather simple, yet open to expansion, e.g. for more complete coverage of the full REFI model.
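
The text description of the diagram could translate to TypeScript roughly as follows. This is a sketch under assumptions: the interface names `CanHaveCodes` and `CanHaveCodings`, the method names and the constructor signatures are hypothetical, since the thread only names the classes and the two abilities.

```typescript
class Code {
  readonly uid: string;
  readonly name: string;
  constructor(uid: string, name: string) {
    this.uid = uid;
    this.name = name;
  }
}

// "can-have-codes": anything that carries a list of codes.
interface CanHaveCodes {
  readonly codes: Code[];
  addCode(code: Code): void;
}

// "can-have-codings": anything that codings of type C can be attached to.
interface CanHaveCodings<C extends CanHaveCodes> {
  readonly codings: C[];
  addCoding(coding: C): void;
}

class TextCoding implements CanHaveCodes {
  readonly codes: Code[] = [];
  readonly from: number;
  readonly to: number;
  constructor(from: number, to: number) {
    this.from = from;
    this.to = to;
  }
  addCode(code: Code): void {
    // Ignore a code that is already assigned to this coding.
    if (this.codes.indexOf(code) === -1) this.codes.push(code);
  }
}

class TextDocument implements CanHaveCodings<TextCoding> {
  readonly codings: TextCoding[] = [];
  readonly text: string;
  constructor(text: string) {
    this.text = text;
  }
  addCoding(coding: TextCoding): void {
    this.codings.push(coding);
  }
}

const doc = new TextDocument("I hate this computer");
const wholeSentence = new TextCoding(0, doc.text.length);
wholeSentence.addCode(new Code("c1", "anger"));
doc.addCoding(wholeSentence);
```

An `ImageDocument` with an `ImageCoding` would implement the same two interfaces with different boundary fields.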

Text document and text coding (or any document and its coding: imagine an ImageDocument having an ImageCoding) are closely coupled and know about each other, so the coding can e.g. check its own validity by verifying that it lies within the text range of the document.

The text document itself currently does not deal with text formatting: for applying codings we only need text boundaries (text from/to), no matter how the text itself is formatted. It might make sense to introduce a "TextContent" interface that just provides a "getRange(from,to)" and can, in addition, hold any formatting.
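
A possible shape for such a "TextContent" interface, together with the range check a coding could run against it. Only `getRange(from, to)` comes from the text above; the `length` property, the `PlainTextContent` class and the `isValidRange` helper are assumptions for this sketch.

```typescript
interface TextContent {
  readonly length: number; // assumed: needed to validate boundaries
  getRange(from: number, to: number): string;
}

// Plain-text implementation. A formatted implementation could keep its
// formatting internally and still expose the same two members.
class PlainTextContent implements TextContent {
  private readonly text: string;
  constructor(text: string) {
    this.text = text;
  }
  get length(): number {
    return this.text.length;
  }
  getRange(from: number, to: number): string {
    return this.text.slice(from, to);
  }
}

// True if [from, to) is a non-empty range inside the content.
function isValidRange(content: TextContent, from: number, to: number): boolean {
  return (
    Number.isInteger(from) &&
    Number.isInteger(to) &&
    0 <= from &&
    from < to &&
    to <= content.length
  );
}

const content = new PlainTextContent("I always struggle here");
console.log(isValidRange(content, 9, 17)); // true
console.log(isValidRange(content, 9, 99)); // false
console.log(content.getRange(9, 17));      // "struggle"
```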

In domain-driven design terms, the documents would be aggregate roots that contain their codings.

@jankapunkt
Member

Hey @jdittrich thanks for the input. Let me comment while reading through:

  • I like the abstraction through interfaces, because we will deal in the future with multi-modal content
  • I agree with the TextDocument being the aggregate root, but it should only represent data, not act on it (it should not manage its own lifecycle)
    • I personally see the acting as the role of the editor or a bridge between the editor functionality and whatever abstraction will exist to build relation between the source, the codes and the selections
  • since we stick to REFI terminology, we should use "selection" instead of "coding" and "source" instead of "document"
  • I currently model the relation between codes and selections as 1..1
    • so there is always exactly one code represented by a selection
    • multiple codes occur automatically through overlapping selections
    • I don't see the use case within the domain of QDA for one selection with the exact same start/end boundaries linking to multiple codes, but maybe you do?
  • I also see formatting as a separate entity here but closer related to codes than selections or editor or text source
    • codes can "demand" formatting rules (such as color), but sources do not necessarily need to "obey" (text might apply the color, but Images or Audio may not and might apply other formatting rules, related to the code)
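
To make the 1..1 model concrete, here is a small sketch in which every selection carries exactly one code and multiple codes at a position arise from overlapping selections. The type and function names (`TextSelection`, `codesAt`, `textHighlightColors`) and the optional `color` field are assumptions for illustration.

```typescript
interface Code {
  uid: string;
  name: string;
  color?: string; // a formatting rule the code "demands"; sources may ignore it
}

// Exactly one code per selection (the 1..1 relation described above).
interface TextSelection {
  start: number; // inclusive
  end: number;   // exclusive
  code: Code;
}

// All codes that apply at a character position, via overlapping selections.
function codesAt(selections: TextSelection[], position: number): Code[] {
  return selections
    .filter((s) => s.start <= position && position < s.end)
    .map((s) => s.code);
}

// A text source may honor the demanded colors; an audio source could skip this.
function textHighlightColors(selections: TextSelection[], position: number): string[] {
  return codesAt(selections, position)
    .map((c) => c.color)
    .filter((c): c is string => c !== undefined);
}

const anger: Code = { uid: "c1", name: "anger", color: "red" };
const technology: Code = { uid: "c2", name: "technology" };

// Selections over "I hate this computer" (20 characters).
const selections: TextSelection[] = [
  { start: 0, end: 20, code: anger },       // whole sentence
  { start: 12, end: 20, code: technology }, // "computer"
];

console.log(codesAt(selections, 15).map((c) => c.name)); // [ 'anger', 'technology' ]
console.log(codesAt(selections, 5).map((c) => c.name));  // [ 'anger' ]
console.log(textHighlightColors(selections, 15));        // [ 'red' ]
```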

@jdittrich
Collaborator Author

jdittrich commented Dec 5, 2024

should use "selection" instead of "coding"

Yes… what I called "coding" collapsed REFI's "coding" and "selection". REFI, it seems, has separate "selection" and "coding" abstractions, where the "selection" is document-specific (like "audio selection") and the "coding" seems to be the thing that codes are applied to.

only represent but not act on data (it should not manage its own lifecycle)…
…acting as the role of the editor or a bridge between the editor functionality

I think that the source/selection aggregate needs functionality to change the source and manage itself. This would allow it to ensure its own integrity (otherwise, I think, it would create an anemic domain model). The document would not need to know how users change e.g. the text itself: a bridge between editor and document could call the respective methods or send commands. (I think there might be a conflict here between what Vue.js seems to expect, namely potentially rich interfaces that operate on JSON data, and classic OOP with its assumption of ensuring integrity by keeping data in objects.)
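
A sketch of what "managing itself" could mean for such an aggregate: all changes go through methods that keep selections consistent with the text. The class name `TextSource`, its methods and the boundary-shifting policy on insert are assumptions, not a description of existing code.

```typescript
interface TextSelection {
  start: number; // inclusive
  end: number;   // exclusive
}

class TextSource {
  private text: string;
  private selections: TextSelection[] = [];

  constructor(text: string) {
    this.text = text;
  }

  // Rejects selections that do not lie inside the current text.
  addSelection(start: number, end: number): void {
    if (start < 0 || start >= end || end > this.text.length) {
      throw new RangeError(`Selection ${start}..${end} is outside the text`);
    }
    this.selections.push({ start, end });
  }

  // Inserts text and moves selection boundaries so they still cover the
  // same characters. Assumed policy: text inserted inside a selection
  // becomes part of it; text inserted at either edge does not.
  insertText(at: number, inserted: string): void {
    if (at < 0 || at > this.text.length) {
      throw new RangeError(`Insert position ${at} is outside the text`);
    }
    this.text = this.text.slice(0, at) + inserted + this.text.slice(at);
    const n = inserted.length;
    this.selections = this.selections.map((s) => ({
      start: s.start >= at ? s.start + n : s.start,
      end: s.end > at ? s.end + n : s.end,
    }));
  }

  selectedTexts(): string[] {
    return this.selections.map((s) => this.text.slice(s.start, s.end));
  }
}

const interview = new TextSource("I hate this computer");
interview.addSelection(2, 6);            // "hate"
interview.insertText(0, "Honestly, ");   // the editor bridge would call this
console.log(interview.selectedTexts());  // [ 'hate' ]
```

A bridge between the editor and the source would translate user edits into calls such as `insertText`, so the source never needs to know how the edit was made.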

so there is always exactly one code represented by a selection
one selection with the exact start/end boundaries linking to multiple codes but maybe you do?

Yes! If I have a passage in an interview that reads "I always struggle here – I hate this computer", I can create a selection on "I hate this computer" and assign the codes "anger" and "technology" to it. This "assigning multiple codes to a selection" seems to be very common, since any part of the text can have multiple meanings on different levels. (Now the data could still assume these to be a 1:1 match on selections, but that would deviate from the users' assumption of "this is a part of the text that I coded with 3 different codes".)
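
The two views can coexist: data stored as 1:1 records can be grouped by identical boundaries to present "one selection with several codes" to the user. The record shapes and the `groupByRange` function below are hypothetical and only illustrate that mapping.

```typescript
// Storage shape: one code per record (1:1).
interface CodedRange {
  start: number;
  end: number;
  code: string;
}

// User-facing shape: one selection with all codes applied to it.
interface MultiCodedSelection {
  start: number;
  end: number;
  codes: string[];
}

// Groups records that share exactly the same start/end boundaries.
function groupByRange(records: CodedRange[]): MultiCodedSelection[] {
  const byRange = new Map<string, MultiCodedSelection>();
  for (const r of records) {
    const key = `${r.start}:${r.end}`;
    let group = byRange.get(key);
    if (!group) {
      group = { start: r.start, end: r.end, codes: [] };
      byRange.set(key, group);
    }
    group.codes.push(r.code);
  }
  return Array.from(byRange.values());
}

const interviewText = "I always struggle here – I hate this computer";
const quote = "I hate this computer";
const quoteStart = interviewText.indexOf(quote);
const quoteEnd = quoteStart + quote.length;

const records: CodedRange[] = [
  { start: quoteStart, end: quoteEnd, code: "anger" },
  { start: quoteStart, end: quoteEnd, code: "technology" },
];

const grouped = groupByRange(records);
console.log(grouped.length);   // 1
console.log(grouped[0].codes); // [ 'anger', 'technology' ]
```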

formatting as a separate entity here but closer related to codes than selections or editor or text source

I did not model text formatting yet (by this I mean things like the text document itself, as opened in MS Word or similar, having passages that are bold, italic or green). I first thought of it but then did not: for the selection, coding or code, it should not matter whether the text is green or bold.
It does matter for how the text is presented to the user, though, so what I want to think about is which abstraction makes sense for building a view model, that is, what the user actually sees. This would probably be derived from the formatted text document, the selections and the codes. For OpenQDA it might depend first on what Quill wants. (My initial prototype had a view model that chopped the text into sections whenever either the formatting or the selection changed.)
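
The chopping approach mentioned for the prototype can be sketched as follows: collect every index where a formatting span or a selection starts or ends, then emit one segment per gap between neighboring indices. This is a reconstruction under assumptions, not the prototype's code; `Span`, `Segment` and `buildSegments` are made-up names, and it does not use Quill's data format.

```typescript
interface Span<T> {
  start: number; // inclusive
  end: number;   // exclusive
  value: T;
}

interface Segment {
  start: number;
  end: number;
  text: string;
  formats: string[]; // e.g. "bold"
  codes: string[];   // names of codes whose selections cover this segment
}

function buildSegments(
  text: string,
  formats: Span<string>[],
  selections: Span<string>[]
): Segment[] {
  // Every boundary of a span is a potential cut point.
  const cuts = new Set<number>([0, text.length]);
  const clamp = (i: number) => Math.max(0, Math.min(text.length, i));
  for (const s of [...formats, ...selections]) {
    cuts.add(clamp(s.start));
    cuts.add(clamp(s.end));
  }
  const sorted = Array.from(cuts).sort((a, b) => a - b);

  // Values of all spans that fully cover the range [from, to).
  const covering = (spans: Span<string>[], from: number, to: number) =>
    spans.filter((s) => s.start <= from && to <= s.end).map((s) => s.value);

  const segments: Segment[] = [];
  for (let i = 0; i + 1 < sorted.length; i++) {
    const from = sorted[i];
    const to = sorted[i + 1];
    segments.push({
      start: from,
      end: to,
      text: text.slice(from, to),
      formats: covering(formats, from, to),
      codes: covering(selections, from, to),
    });
  }
  return segments;
}

const sampleText = "I hate this computer";
const segments = buildSegments(
  sampleText,
  [{ start: 2, end: 6, value: "bold" }], // "hate" is bold
  [
    { start: 0, end: 20, value: "anger" },
    { start: 12, end: 20, value: "technology" },
  ]
);

console.log(segments.map((s) => s.text)); // [ 'I ', 'hate', ' this ', 'computer' ]
```

Each segment has uniform formatting and a uniform set of codes, so a view can render it as a single styled element.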
