According to the specification:

5.2.3 SpeechSynthesisUtterance Attributes
text attribute: "This attribute specifies the text to be synthesized and spoken for this utterance. This may be either plain text or a complete, well-formed SSML document."
It is not clear how an entire SSML document is expected to be parsed when it is assigned to the .text property of a single SpeechSynthesisUtterance instance.
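
To make the question concrete, here is a minimal sketch of what assigning a complete SSML document to .text might look like. The buildSsml helper and its parameters are hypothetical, introduced only for illustration; the namespace and version attributes follow SSML 1.1. Whether a given browser parses the markup or speaks it as literal text is exactly what remains unclear, so treat this as an experiment rather than a confirmed usage pattern.

```javascript
// Hypothetical helper (not part of any API): wraps plain text in a
// minimal, well-formed SSML 1.1 document.
function buildSsml(text, lang = "en-US") {
  // Escape characters that would otherwise break XML well-formedness.
  const escaped = text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
  return (
    '<?xml version="1.0"?>' +
    `<speak version="1.1" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="${lang}">` +
    escaped +
    "</speak>"
  );
}

const ssml = buildSsml("Hello & welcome");

// Only runs where the Web Speech API is available (i.e. in a browser).
if (typeof window !== "undefined" && "speechSynthesis" in window) {
  const utterance = new SpeechSynthesisUtterance();
  // The whole SSML document goes into the single .text property.
  utterance.text = ssml;
  window.speechSynthesis.speak(utterance);
}
```

Listening to the output in different browsers would show whether the markup is interpreted, ignored, or read aloud.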