Some of this will be application dependent. For example, in Zoom, the user can select Speaker Focus or Auto Framing, and that choice will alter what's being broadcast from the room.
Much of the experience will be driven by our AI algorithms, which determine things such as how long after someone starts talking Sight frames them, or what happens if no one in the room is speaking.