VRML Behaviour - a proposal
| This should no longer be considered a "proposal for behaviour in VRML". That process has gone a
slightly different way nowadays, and I have not had time to update this. What is still interesting here,
however, are my ideas about multiuser spaces and about "engines" and "brains", a duality that is necessary
for making a shared virtual universe real.....
|
| I believe this message to be truly interesting and to actually contain useful insights.
I apologize in advance for its size, but I couldn't make it much smaller....
|
This is very much an evolving document. The latest version of this document is available
at http://www.lysator.liu.se/~zap/vr_prop1.html.
The version you are reading now was last modified September 1997 and has been accessed by
2439 people since November 2019.
Introduction
After reading the documents on Distributed
VR (http://sunee.uwaterloo.ca/~broehl/distrib.html) and
Behaviours (http://sunee.uwaterloo.ca/~broehl/behav.html) by
Bernie Roehl, I spent some time thinking. (Those of you who have not yet read these
documents should do that now.)
- Actually, I did the thinking while completing our oat harvest.
But that is beside the point.
Working with farming machinery is a great thinking opportunity....
Bernie makes a whole lot of Very Intelligent Observations in his documents.
I would like to extend his thoughts a little, and add my own thoughts to them.
Typography
When I reference some text in Bernie's documents I use the format
"<document>:<title>", e.g. "distrib.html:About Time".
"Levels" of Behaviour
Bernie defines four levels of behaviour ("behav.html:Levels of Behaviour"), naturally
(because he is a computer nerd, like all of us) they are numbered 0 to 3.
- Level 0
- Directly modifying an object: "set position to 10,10,10"
- Level 1
- Simple(?) deterministic behaviour: "move in this direction 30 centimeters per second"
- Level 2
- More complex behaviours: "Search for food", "Find girlfriend", "Do her", etc.
- Level 3
- Decision-making. Deciding which Level 2 behaviour to perform: "Should I find a girl, or go eat?"
The big dividing line is between levels 1 and 2, because a behaviour at level 2 or above "knows" about
the "world". A level 1 behaviour is simply an initial position plus parameters.
Therefore I suggest the following division: The Level 0/1 behaviours are the "engines"
(to borrow a term from Open Inventor). Level 2/3 behaviours are the "brain".
In this text I will mostly talk about the distinction between "engine" and "brain".
Engines, DIS and "Dead reckoning"
My concept of engines is like DIS ("distrib.html:The basic ideas of DIS") "dead reckoning",
only a lot more powerful. An "engine" is, in my view, a behaviour script whose only external
parameter is time.
An engine script doesn't respond to outside events. (The brain does.) It responds only to
time and its initial parameters. The brain provides these initial parameters, and may
modify them, or replace the engine script completely.
The engine script is called each frame by the renderer, to query the position of the
objects in the scene. The script should avoid storing information from one invocation
to the next. It should not rely on variables other than the initial parameters.
An engine must be written in such a way that it can be passed a time T, and the
engine should know what to do at that time.
However, an engine can be of any complexity. It can actually borrow a
lot from Bernie's level 2 behaviours! We could create an engine for "Run to the tree
and climb up". Actually, the more complex the engines are, the less network traffic
will be generated!
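The document prescribes no implementation language (ATLAST is only suggested further down), but the core idea - an engine as a pure function of time - can be sketched in a few lines of Python. The function name and parameters below are my own illustration, mirroring the "WalkTo" packet example in the next section; nothing here is part of any actual VRML specification:

```python
def walk_to(t, start_time, duration, initial_pos, goal):
    """A hypothetical 'WalkTo' engine: position is a pure function of time.

    No state is kept between calls, so any host can evaluate it for any t,
    and a late-arriving packet still puts the object exactly in sync.
    Assumes duration > 0.
    """
    # Clamp progress to [0, 1] so the engine "stops" when the end time arrives.
    progress = max(0.0, min(1.0, (t - start_time) / duration))
    # Linear interpolation between the initial position and the goal.
    return tuple(a + (b - a) * progress for a, b in zip(initial_pos, goal))
```

Evaluated halfway through its duration the function yields the halfway point; a host that receives the packet late simply calls it with the current time and is instantly in sync.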
Time stamps
An example: The squirrel brain (we'll get to it later) decides, "hey, there is food
over there", it sends out a packet on the net saying, in essence:
To: Object (squirrel#7):
Execute function "WalkTo"
Start time: 1995-10-01 14:31:09.55
Duration: 5 seconds
Parameters:
InitialPos = 30.0, 25.3, 11.5
Goal = 50.0, 30.0, 11.5
This will tell the squirrel engine (loaded through a WWWScript node) to execute the function "WalkTo".
The parameters "InitialPos" and "Goal" are two input parameters to the "WalkTo" function.
| The representation would naturally not be text, but much more compact.
E.g. the function name could be an enumerator giving that function's ordinal in
the squirrel engine script. |
This will cause the squirrel to start moving towards 'Goal'. The walking itself could be
arbitrarily complex: everything from a linear motion - the squirrel 'floats' to the food
- to a highly orchestrated motion where every limb moves delicately, the squirrel stops halfway,
sniffs, blinks, looks behind him, yawns, scratches himself behind the left ear, and then continues -
as long as it all is a predetermined function of time!
Since the packet contains the initial timestamp, a small latency in the net is hardly
noticeable.
As soon as the packet arrives at any host running the simulation, the squirrel will be in sync,
because the engine is a function of time, and can easily be "jumped into".
And when the end-time arrives, it will stop.
But what if the squirrel "brain" detects a wolf after two of those five seconds? No problem.
A new packet is sent:
To: Object (squirrel#7):
Execute function "Flee"
Start time: 1995-10-01 14:31:12.04
Duration:
Parameters:
InitialPos = 42.0, 27.5, 11.5
DirectionVector = -12.0, -5.0, 0.0
Now a new "execute script" packet is sent. In this case a new function (Flee) in the same
file is used.
The old engine is overridden by the new one (an engine script which
is "in progress" is overridden by a new engine script whose timestamp falls within the timespan
of the currently executing script).
An indefinite "fleeing" behaviour will be initiated.
Hopefully (for the squirrel) the brain will eventually decide to stop fleeing (and get some rest) and send a new
override packet with a stop message...
| The "WalkTo" and "Flee" scripts are in this case in one file. They could also be two
completely different files. It could also be one script with a
different parameter (WhatToDo="walk" or "flee"). The important thing is that the brain supplies
the correct parameters for a particular script. The brain must know the engine's interface.
|
The foreseeable future
| Here's my little blockbuster.... :-)
|
The "brain" could easily foresee the future of a simple deterministic
behaviour. A ball dropping to the floor will bounce. A vase will shatter. Therefore the brain
is allowed to post messages with a timestamp in the FUTURE.
Allowing looks into the foreseeable future can improve remote responsiveness greatly.
The vase will shatter and the ball will bounce at the same instant for all viewers.
Of course the bounce or the shatter could have been built into a complex fall-and-bounce or
fall-and-shatter "engine" behaviour. As said before, these can be arbitrarily complex. But
this is just an example.
In a complex case, the brain could foresee that object A and object B will collide in three seconds,
and post a foresight message about the resulting bounce.
This means that for each virtual object, there is a sort of "message queue", which can store
these "future time-stamped" events, and sequentially execute them as their respective start-time arrives.
Even in the squirrel case, these future messages can be used without sacrificing realism.
Let's say the brain detects the wolf. So the brain posts a packet saying "Start fleeing in
2 seconds". This gives us the following "features":
- The squirrel gets a pretty natural "reaction time", just like a real squirrel.
- Even if the wolf walks out of range immediately, the squirrel will still run away,
again just like a real squirrel.
- And most importantly, all squirrels in all simulations are completely in sync. There
is no lag whatsoever.
Naturally, when posting into the "future", we must be allowed to change our minds, if we want
to. If, for instance, the wolf runs past the squirrel and out of its "scare me" range, the
squirrel brain might consider not fleeing.
If it does this, it posts another future packet that has an earlier timestamp than the "Flee"
packet, plus an override flag. This flag essentially means "when a packet of this type
arrives, flush any events with later times from the queue".
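The per-object queue of future-stamped events, together with the override flag, could look something like this minimal Python sketch (all names are mine, purely illustrative):

```python
import heapq

class EngineQueue:
    """Per-object queue of time-stamped engine invocations (a sketch).

    Events may carry a start time in the future.  A packet with the
    override flag flushes every queued event scheduled at or after its
    own start time - exactly the change-our-minds case above.
    """

    def __init__(self):
        self._events = []  # min-heap of (start_time, function_name)

    def post(self, start_time, name, override=False):
        if override:
            # Flush events with later (or equal) start times.
            self._events = [e for e in self._events if e[0] < start_time]
            heapq.heapify(self._events)
        heapq.heappush(self._events, (start_time, name))

    def due(self, now):
        """Pop and return every event whose start time has arrived."""
        ready = []
        while self._events and self._events[0][0] <= now:
            ready.append(heapq.heappop(self._events)[1])
        return ready
```

In the squirrel scenario, the brain posts "Flee" two seconds into the future; when the wolf leaves, an earlier-stamped packet with the override flag flushes the pending "Flee" before it ever executes.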
So, where is the brain? As Bernie notes ("behav.html:Why not level 2?"), the brain must run
in exactly one place. I couldn't agree more.
But, in my opinion, there are two kinds of "brains":
- Scripted brains
- These are the brains which can be expressed as a simple program. A typical scripted brain is
the brain of a vase. The vase brain responds to you grabbing it (it ends up in your hand).
You may shake it around, and throw it. If the velocity is high enough, it will shatter.
The scripted brain is fully described by its program.
- "Special" brains
- These are brains which do the complex stuff. This includes YOU. (You are the "brain" of
your avatar.) The special brains may even have external sensors connected to them
(an SF Bay Area fog detector, or whatever).
Obviously the physical location (i.e. the IP address) of a "special" brain is important.
The "special" brain exists somewhere on the net. It sends and receives messages, and
governs the object.
The scripted brains, on the other hand, can reside anywhere! And, IMHO, the scripted
brains should run on the host machine of the first user who causes that object to be loaded.
If this user logs off while others are still "there", the "brain" must be handed over
to somebody else. Another important feature is
that the "brain" can also decide for itself to "become owned" by another user.
Why do I want this to happen, you may ask?
Well, I am thinking of two things:
- Single-user scenario
- If "brains" must be run in a special "brainRunner" daemon somewhere, it is cumbersome
for the poor single-user person who wants to play around with his VRML world locally, WITHOUT
even being on the net! This must be possible, and easy to do. If objects by default run on the same
host as the first user who loads them, this will be very simple, because he is the ONLY user!
- Responsiveness
- If I go into a room and see a vase on a shelf, and I grab it, only to notice it responds stupidly,
because the "brain" for this object runs on a machine with a 300ms net lag from my site, it ain't fun.
This is why I want scripted brains to be able to "float around". In the "pick up" operation of a
vase, the vase "brain" may decide to move to MY machine. So I, who am actually interacting with
the vase right now, get the best response times.
Objects and Brains
This is how I envision an object inclusion in VRML:
DEF Squirrel {
WWWInline geometry "http://www.xyz/~zap/squirrel.vrml"
WWWScript engine "http://www.xyz/~zap/squirreng.vrbl#SitNWait"
WWWScript brain "http://www.xyz/~zap/squirrel.vrbl"
}
In the above example we inline the geometry from the file "squirrel.vrml". The first engine
to be applied to the squirrel is in squirreng.vrbl, the function "SitNWait", with the
default parameters for that engine.
(Loading no engine would be legal, and probably the normal thing to do.)
The brain is a scripted brain: squirrel.vrbl. In this case it is loaded at the same spot as the
engine. This automatically gives the brain the same scope as the engine.
The brain could also have been loaded at a higher level.
|
For the language choice, I would suggest ATLAST
(a Forth-like language). It has the advantages of simplicity and of being in the Public Domain.
|
For a special brain, something like the following:
DEF MyAvatar {
WWWInline geometry "http://www.xyz/~zap/me.vrml"
WWWScript engine "http://www.xyz/~zap/me.vrbl"
WWWScript brainlink "http://www.xyz:9876/~zap/"
}
This syntax means that the brain isn't in a file. The brain is already there,
running, out on the net. The port is where the messages go, and where they
come from. This is a special brain (because it is my brain!!) so it can't be
handed over. This object wouldn't work if I wasn't connected to the net (or
couldn't at least reach www.xyz, in this case).
Allowed modifications
Engines are allowed to modify the object they are loaded for, and any child objects
of that object. I.e. the squirrel engines can move the squirrel itself, the tail,
the legs and the nose (the tail could also have its own child behaviour, if needed),
but they are not allowed to modify the tree beside it. Similarly, the brain can directly
invoke engines in the squirrel, its tail, or its nose, but not in the tree. (It can,
however, send a message to the tree.)
The VRML specification specifically states that a browser isn't required to keep the scene
graph model in memory. The key issue when connecting the engine to the geometry, however, is
that the scene graph must be retained - but only for the modifiable objects.
Here is a proposal for how it might look:
DEF Squirrel DYNAMIC Separator {
# Load squirrel behaviour script
WWWScript engine "http://www.lysator.liu.se/~zap/squirrel.vrbl"
WWWScript brain "http://www.lysator.liu.se/~zap/squirrel.vrbl"
DYNAMIC Transform "pos" {
translation 0 0 0 # Position the squirrel
}
# Squirrel body
Cube {
width 10
height 20
depth 10
}
DYNAMIC Separator "rgt_thigh" {
DYNAMIC Transform "pos" {
translation . . .
rotation . . .
}
# Squirrel's RIGHT thigh
Cube {
width 2
height 2
depth 15
}
DYNAMIC Separator "ankle" {
DYNAMIC Transform "pos" {
translation . . .
rotation . . .
}
# Squirrel's ankle
Cube {
width 2
height 2
depth 15
}
}
}
DYNAMIC Separator "left_thigh" {
DYNAMIC Transform "pos" {
translation . . .
rotation . . .
}
# Squirrel's LEFT thigh
Cube {
width 2
height 2
depth 15
}
DYNAMIC Separator "ankle" {
DYNAMIC Transform "pos" {
translation . . .
rotation . . .
}
# Squirrel's ankle
Cube {
width 2
height 2
depth 15
}
}
}
DYNAMIC Separator "head" {
DYNAMIC Transform "pos" {
translation . . .
}
DYNAMIC Material "blush" {
diffuseColor 0.6 0.4 0.2
}
DYNAMIC Coordinate3 "mesh" {
point [-2 0 -2, -2 0 2, 2 0 2, . . . ]
}
IndexedFaceSet {
coordIndex [ 0, 1, 2, . . . ]
}
}
}
In the above, the DYNAMIC keyword marks the nodes that must be retained in the scene graph, and the quoted names ("pos", "rgt_thigh", etc.) give the engine handles to address them.
Reusability of behaviour
We must allow some kind of "attributes" to be attached at any level of the object too. These
can have myriads of uses, and I would suggest a simple variable=value format straight
in the .vrml file.
If, for instance, we set attributes defining the leg length and allowed turning angles for
joints in the squirrel, we could use a standard "walk" model for walking without having to
write our own squirrel-walk. We could just pick one off the shelf, adopt its naming scheme,
set up the attributes to make the "walk" behaviour behave, and off it goes.
By this method, lots of people will start writing useful behaviours. With the correct
division between attributes
(defined in the model) and their use (in the engines), reusability is ensured.
What about ZOI's and such?
We want this all to scale, and to scale it we need some zoning, and similar. But consider
this:
Today most worlds are small, because of the rendering power of contemporary
machines. A single .vrml file wouldn't fit much more than five people before the RENDERING
is bogged down anyway, so I suggest:
For now, we ignore this issue. (Using the discussion below, doing so now is sort of "safe".) Why?
Well, if the current shape of VRML worlds sets the standard, we will not be building "huge,
seamless worlds". We are building little islands-with-links. Therefore, IMHO, the
ZOI == the top-level vrml world.
| Of course I would love it if we could add WWWAnchors for user position and
direction of motion. That way we could walk through doors and fly through windows instead of this unintuitive
"click on the door to go elsewhere" of today.
|
Network Protocol
I have also spent some time thinking about a network protocol that efficiently addresses the
needs of implementing the above. Here are some of the basic ideas I came up with:
Minimizing info
Most lengthy stuff could be given an identifier. A URL to something could be replaced
by the IP number and an identifier. If somebody (newly logged on, I would imagine) got the
identifier before knowing what it meant, this host could send out a "what the hell
does X mean?" message, and get the explanation from someone who knows.
Very complex "engines" reduce net traffic: If the "walking" of a person is accomplished
by a spline-interpolated table of samples of real walking people's leg-joint rotation
patterns, and the "walking" behaviour includes such complex "parameters" as "walk to this
spot following a spline with these control points at these points in time", no net traffic
needs to happen during the entire, highly realistic walk! Compare that to transferring the
angle of each joint each frame.......
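To make the bandwidth argument concrete, here is a small Python sketch (my own illustration, not part of the proposal) of a walk path evaluated from a handful of timed control points. Piecewise-linear interpolation stands in for a real spline:

```python
def path_position(t, keyframes):
    """Evaluate a walk path from a few timed control points.

    keyframes: a time-sorted list of (time, (x, y, z)) pairs.  One small
    packet of control points lets every host evaluate the whole walk
    locally, as a pure function of time.  Piecewise-linear interpolation
    is used here for brevity; a real engine might run a Catmull-Rom or
    B-spline through the same points.
    """
    if t <= keyframes[0][0]:
        return keyframes[0][1]       # before the walk starts
    if t >= keyframes[-1][0]:
        return keyframes[-1][1]      # after the walk ends
    # Find the segment containing t and interpolate within it.
    for (t0, p0), (t1, p1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            u = (t - t0) / (t1 - t0)
            return tuple(a + (b - a) * u for a, b in zip(p0, p1))
```

Four or five keyframes - a few dozen bytes on the wire - replace a per-frame stream of joint angles for the entire walk.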
Each object created in the world would receive its own instance identifier (e.g. the IP address
of the creator + an index number). This is used throughout to reference this object.
Do we need a server?
Other than the normal http server providing the files, there is no definite hardcoded 100%
need for any special server software. Without a server, the following scenario is
possible:
Charlie and Art decide to cyber a little. Charlie launches his browser, and loads some world.
Art loads his browser, but doesn't load a WORLD. He simply says "connect to Charlie". What
happens now is:
- Art's computer sends a "what's happenin', dude?" message to Charlie's computer.
- Charlie's machine notes that a new user has appeared. It adds Art's IP address to a list
(which up till now only contained himself)
- Charlie's machine answers by revealing:
- The world name
- Any objects loaded (that are not in the world description)
- The currently running engines and their parameters for all objects
- A list of the brains already running
- A list of IP addresses where to send messages. (In this case, only the addresses of
Charlie's and Art's machines)
- Art's machine loads all this, sets up all engines.
- While this is happening, Charlie is moving some furniture. He sends messages to all IPs
in the list except himself.
- Art's machine receives the updates and changes the engines accordingly. When everything is
loaded (maybe even while it is loading), Art can see the furniture moving.
- Art's machine sends a "Create Object" package. The package contains:
- The URL to the .vrml file of Art's avatar
- The URL to the .vrbl file with Art's "engines"
- The URL to Art's "brain" (= his mouse/joystick/powerglove)
- Charlie's machine receives this, and updates his picture accordingly
- Art goes to the table that Charlie just moved. The brain for this object is now running
on Charlie's machine. Art grabs the table. By doing this, the brain transfers itself to
Art's machine, to improve responsiveness. Art shakes the table violently.
He grins to himself because of the nice rapid responses.
- Charlie looks puzzled and ponders the waving table.
- Then suddenly Jamie connects to Art's machine. He sends the "what's happenin', dude?" message.
- Art's machine informs Jamie's about the world, the objects, and their state. It also sends
out a new list, now of three IP addresses, where all info should go.
- Charlie has had enough; he logs off. That generates a delete message for his avatar, and
his IP address is deleted from the list.
- ....etc.
This scenario is strictly peer-to-peer, and there is NO server involved anywhere (except
for holding the files). As I see it, a server would however be necessary to start the
initial connection. In the example, Charlie and Art had to meet beforehand to decide they
should cyber away. Jamie had to be "in on it" too. A server would then take the place of
the "connection master".
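The bookkeeping in the scenario above - the growing IP list, the send-to-everybody-but-myself rule, and the logout cleanup - can be sketched as a toy in-memory model. All names are my own; real messages would of course be UDP packets between machines, while here "sending" is a direct method call so only the list-maintenance logic shows:

```python
class Peer:
    """Toy in-memory model of the peer-to-peer session bookkeeping."""

    def __init__(self, name):
        self.name = name
        self.peers = [self]       # the IP list starts with just ourselves
        self.world_state = []     # received updates (engines, objects, ...)

    def whats_happening(self, newcomer):
        """A newcomer connects: add it and give everybody the new list."""
        self.peers.append(newcomer)
        for p in self.peers:
            p.peers = list(self.peers)
        # The newcomer also gets a copy of the current world state.
        return list(self.world_state)

    def broadcast(self, message):
        """Send an update to everybody on the list except ourselves."""
        for p in self.peers:
            if p is not self:
                p.world_state.append(message)

    def leave(self):
        """Log off: remove ourselves from everybody else's list."""
        for p in self.peers:
            if p is not self:
                p.peers = [q for q in p.peers if q is not self]
```

Walking through the scenario: Art's `whats_happening` call against Charlie grows both lists to two entries; Jamie's call against Art grows all three lists to three; Charlie's `leave` shrinks the survivors' lists back to two. The servered variant is the same code with one always-on `Peer` that never renders.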
With a bit of thought, you can see that this approach scales nicely too. Assume there is
a server. This is the same scenario again, now WITH a server:
- Charlie loads the VRML file (he is the first to do so). The file contains a line like:
WWWServer {
name = "vrtp://vr.wired.com:9876/this_world"
}
This server is contacted by the browser. The browser sends the "what's happenin',
dude?" message to the server. The server sends:
- The world name
- A list of IP addresses where to send messages. (In this case, only the addresses of
the server and Charlie)
- The world appears on Charlie's screen. He grabs the table.
- The table brain decides to move to Charlie's machine, to improve his performance. The
server gives the brain to Charlie, and Charlie moves around the furniture a little.
- Now Art connects to the server. Art gets:
- The world name
- Any objects loaded (Charlie's avatar!)
- The currently running engines and their parameters for all objects (some moving furniture)
- A list of the brains already running (the table....)
- A list of IP addresses where to send messages. (Charlie, Art, the server)
- ...etc.
The only difference now is that the server is "in the loop". The server is just like
"one of the guys". The difference is that the server never bothers to render anything,
and that it is always "there" (i.e. always something to connect to). It may occasionally
need to run a brain, if everybody else logs out (and it isn't set up to reset the world
if that happens, which a game probably would do).
But let's assume 2093 people log in. The server may decide to move to
some broadcasting mode. It sends out a list of IP addresses to send to: only
the server's own IP.
So when Art logs in, he is never informed about Charlie's IP address (or
that of any other of the 2093 people there). He never sends packets directly
to the other users. He sends them to everybody-but-himself on the list (which is
only the server).
The server then starts to broadcast the stuff instead......
Similarly, the server could pass a multicast address as the one to be used...
- To summarize:
- In my view, the server doesn't do much more than maintain the logins/logouts.
In a future, scalable scenario, the server would be responsible for ZOI testing,
and for deciding WHAT to send to WHOM, instead of sending EVERYTHING to EVERYBODY within
the same .vrml file. (But I also think that the vrml file itself can BE the ZOI
for the foreseeable future, due to rendering limitations.) The server also decides
when and if to switch to multicasting.
The important thing is that the peer-to-peer approach is, protocol-wise, identical to
the servered and multicast approaches. Meaning, my pal and I could test something out
between ourselves. THEN we could set up a server. THEN we could make the server intelligent.
THEN we could go to virtual lunch together :-)
The different interfaces
It can be complicated to keep track of which data needs to be defined in a network-transmittable
manner, and which does not. There are many interfaces that need to be defined. I believe that
some must be defined now; others will "evolve".
Here is an overview of the interfaces:
- ETGI - Engine To Geometry Interface
- I call this an Interface instead of a protocol, because it
never has to pass a network boundary.
- This is the only interface the people advocating the "API" approach want to define.
But the API approach only moves the problem; it does not solve it.
(It does keep VRML itself leaner, though, which is a valid argument.)
- This is where we need to define what modifications an engine can do on geometry.
My suggestions are listed above.
- Engines can query the geometry's current matrix, etc., per the above.
- Engines can query the geometry's "attributes".
- BTEP - Brain To Engine Protocol
- This is a Protocol because it has to pass a network boundary.
- It is the universal way in which the brain informs the engine of what to do within
a certain timespan.
- The protocol is basically a function call, stored in a network-readable format.
There are many ways this can be done very simply. It needs to include:
- The timestamp, in some universal time format.
- The duration of the event, or zero for infinite.
- The function name (or an ordinal number, for compactness)
- An option to contain the full URL of a different engine script
to replace the old one.
- The override flag.
- The number of parameters passed.
- The parameters for the function, in order. Any left-out parameters assume default values, so the "most commonly used" parameters go first in a function call, with less and less common ones following.
- These will probably be sent as UDP packets. There is no return value.
Packets sent to the engine that runs on the same host as the brain are not sent over the
network. They are directly put in the message queue of the local engine.
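As a sketch of how compact such a packet could be, here is one guess at a binary layout, written with Python's struct notation. The field order follows the list above, but the exact layout is purely illustrative - the real protocol would have to be standardized:

```python
import struct

BTEP_HEADER = "!ddHBH"  # start time, duration, ordinal, override, URL length

def pack_btep(start_time, duration, func_ordinal, override, params, engine_url=b""):
    """Pack a hypothetical BTEP call into a compact binary packet.

    Layout (one guess, nothing standardized): two doubles for start time
    and duration (0.0 = infinite), a 16-bit function ordinal, an override
    flag byte, a length-prefixed replacement-engine URL (empty = keep the
    current engine), then a 16-bit parameter count and each parameter as
    a double.
    """
    header = struct.pack(BTEP_HEADER, start_time, duration,
                         func_ordinal, 1 if override else 0, len(engine_url))
    body = engine_url + struct.pack("!H", len(params))
    body += struct.pack("!%dd" % len(params), *params)
    return header + body

def unpack_btep(packet):
    """Inverse of pack_btep: recover the call from the wire bytes."""
    start_time, duration, ordinal, override, url_len = \
        struct.unpack_from(BTEP_HEADER, packet, 0)
    offset = struct.calcsize(BTEP_HEADER)
    url = packet[offset:offset + url_len]
    offset += url_len
    (count,) = struct.unpack_from("!H", packet, offset)
    offset += 2
    params = list(struct.unpack_from("!%dd" % count, packet, offset))
    return start_time, duration, ordinal, bool(override), url, params
```

With the six doubles of the "WalkTo" example (InitialPos + Goal) the whole call fits in 71 bytes - compare that to streaming joint angles every frame.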
The engine should also support a few standard query functions. These are called
directly by the brain on the local copy of the engine. These should include things like:
- WhereAreYou
- Query the object's current position. (Returns the transformation matrix.)
- GetBS
- Return the bounding sphere in world coordinates (very approximate)
- GetBB
- Return the bounding box in world coordinates (very approximate)
- Status
- Returns status info about the engine.
These functions are standardized because they are available to any brain.
E.g. when the wolf wants to know where the squirrel is, the wolf's brain can query
the local copy of the squirrel directly.
| Naturally it could ask the squirrel's brain,
which would give a more accurate position of the squirrel, but that information would
be subject to lag. Asking the local squirrel might be incorrect (because of a pending
update packet subject to lag), but it doesn't have any lag at all. It is up to the
application to decide which method to use for position testing. For simple proximity
testing, asking the local copy of an object is encouraged, since it saves bandwidth. |
- BTBP - Brain To Brain Protocol
- This is also a Protocol, because it has to pass a network boundary.
- This is the way brains communicate, send and receive messages, and so forth.
It is essentially a trans-network function call.
- The protocol is very similar to the BTEP. Both formats would be founded on a
common base. The call must include:
- The function name
- Parameter count
- Parameter values
- A return packet with the return value is expected. When a brain calls another
brain, the calling brain's thread is held waiting for the response from the
called brain. (We probably need a timeout here.)
- Other - Special interfaces of application specific nature
- These are special application specific interfaces between a "special" brain and any
input devices.
Summary of Document
This concludes my thoughts for this version of this document. I think that the highlights
of this document can be summarized as:
- Two types of behaviour: "Engine" and "Brain". The "Engine" runs everywhere; the "Brain" runs at one spot.
- "Engine" and "Brain" are loaded separately for an object
- The "Brain" posts timestamped invocations of "Engines"
- Invocations in the FUTURE are allowed
- Two types of "Brain"
- Scripted brains can move around at will.
- Engines modify object and children
- We can ignore ZOIs for now (it's a server issue anyway)
- Server only needs to handle logins/logouts
I hope this wasn't completely unreadable.
Comments are welcome to me
or the vrml mailing list.