Notes
Various notes from XDS2007 presentations. Please fill this out with links to slides, project pages, etc.
Day One
John Bridgman, Matthew Tippet - AMD/ATI
- NDA-less 2D specs coming soon, 3D specs coming later, r300 3D specs coming even later
- Dedicated developer support team too
- Modesetting to use ATOM BIOS for some things: set of data tables, and interpreted code, which means you can do native POST on any architecture. Windows driver and BIOS already using this, so it's very capable (not another VESA).
- Legal issues regarding HDCP and DRM video decoding, which means those parts become harder.
- AMD using the doc release process to improve internal documentation quality too
Dave Airlie - Red Hat X.Org work
- Fedora 9 in May/June of 2008
- Composite needs to be on by default. What needs to be fixed for that?
- GLX needs to obey window redirection
- Likewise for Xv
- 2D acceleration should work, hardware accel all the way through rendering
- Need a proper memory manager to tie all these things together
- krh has been working on redirected GL
- airlied working on getting TTM mergeable into the kernel
- cworth has been optimising intel EXA and benchmarking
- Smooth GUI booting: no mode switches and annoying flashes.
- Requires kernel modesetting, which requires TTM.
- X thus requires DRM, which hurts the BSDs and Solaris, which don't really have it everywhere.
- Life's tough, can't hold back development for OSes which lack development resources.
- Some Linux developers want the DRM to become a Linux-only project and not cross-platform anymore, but airlied and others have been admirably resisting this.
Stuart Kreitman - Sun Desktop Update
- EOL of Xsun! Solaris-wide commitment to Xorg
- Transition complete for x86, in process for SPARC and SunRay
- S10_U4/SXDE3 are on Xorg 7.2. S10_U5/SXDE4 are Xorg 7.3.
- Features TSOL, Composite, RANDR 1.2, wide hardware support
- Primary support for ATI, Intel, and NVIDIA
- SPARC hardware projects: ffb, elite3d, PGX64, XVR100, PGX32
- Xorg has a DTrace provider now
- Belenix: Merged osol and Xorg packaging
- Martux: SPARC distro, includes the various SPARC drivers
- FOX project: OSOL project to integrate X efforts among OSOL distros
- http://www.opensolaris.org/os/community/desktop/
Alex Deucher - driver/xf86-video-ati, yesterday, today, and tomorrow
- Yesterday: Hardcoded crtc to output mappings
- 2 output limit, randr-like functionality through mergedfb
- Today: full randr 1.2 support, all outputs supported, initial mac support
- Tomorrow stuff!
- RMX, the panel scaler. Off, full, center, aspect.
- In principle can be used on any output, currently only used on LCDs.
- DVO, external connections for outputs. DAC, TMDS, etc.
- Needed for external TMDS. x86 legacy bios has external TMDS table.
- Non-VBE posting. Code for legacy BIOS that doesn't quite work.
- ATOM BIOS should get this for free with the new parser.
- ATOM parser should get folded into radeon too.
- Planning to merge r128 into radeon.
- Probably want to split mach64 out and drop the old ATI wrapper
- Composite transforms need fixing on R100 and R200 so rotation works
- More TV-out modes, fix PAL, random bugfixes
- Day after tomorrow
- TTM. Needs DRM support. airlied did some experimental work.
- R300/R400 Composite. Render accel and rotation support.
- PCI rework, kernel modesetting.
Zack Rusin - Accelerating Desktops
- Render has good text handling, but is quirky and complicated
- So what gets used? Composition, rasterisation, transforms and gradients
- What is easy to accelerate? Composition, transform, gradients. Rasterisation is a problem.
- EXA doesn't require DRI, is self-contained, basic composition is easy
- EXA syncs. And syncs. And shares state with 3D engine.
- Using the 3D engine from the 2D driver requires writing the code twice.
- (long technical discussion about where to put code)
- Things that still aren't accelerated: Tessellation, Image effects, Curve decomposition
- (lots of head nodding and general planning)
X.Org Board Update
- Trying to convert from LLC to 501(c)(3)
- Also trying to divest a cash backlog, since we should be not-for-profit
- Most of the current cash reserve is from heldover consortium fees
- Still need other ideas for things to do! Ask!
- We are now VESA members.
- May join Khronos soon for GL stuff.
- Developers, please sign up for Foundation membership!
Zhenyu Wang - XvMC and stuff
Keith Packard - Intel Status Update
- 2D driver for i810 through present
- 3D driver splits at major architectural changes: 830, 915, 965
- Include old hardware support for new features
- Test environment includes at least one of each chipset
- Ship new features when ready
- BIOS-free modesetting, hotplug monitors, rotation, other randr1.2 hotness
- TV-out supported on all mobile chips
- Current driver: OpenGL 1.5, overlays on hardware that has it, textured video on everything else
- Next driver: OpenGL 2.1, HW MPEG decode, output scaling, HDMI, power savings
- DFGT, DRRS, DPST, D2PO. Magic acronyms with no definition.
- OpenGL performance measurement and tuning planned
- Future media: MPEG iDCT, VLD, H.264, VC-1, de-interlace, etc.
- Output support: kernel modesetting, eliminate POST on S3 resume
- Moving to TTM and new memory manager, lots of optimisation there
- GLSL with minor issues. TTM mostly working. Shipping GL 2.1 by January.
- http://cworth.org/tag/exa/
- Cairo adoption: gnome (librsvg, poppler, evince), mozilla, webkit-gtk
- Stack: app, cairo, Render, EXA
- Benchmarks! Wouldn't those be nice
- cairo-perf: git://git.cairographics.org/git/cairo, make perf. synthetic, micro
- x11perf. synthetic, micro, very poor Render coverage
- mozilla trender. http://cworth.org/trender_bookmark/. real, macro.
- Results. EXA does blit and solid fill really really well.
- EXA is a slowdown for trender, which is primarily a glyphs benchmark.
- i965 problems. hidden RMW cycles, synchronous compositing, pinned glyphs.
- Improvements. Fixed RMW cycles.
- Posted patches to store glyphs as pixmaps and eliminate some fallbacks
- Idea for "PolyComposite" hook.
- Future work: memory management, fallback elimination, transforms
- Want to do trapezoid/polygon rasterisation and gradients in hardware too
- Performance is really bad. How can I help?
- Gamers want high performance, aren't finding it in OSS drivers
- R100 was released in 2000, didn't get HyperZ until 2004
- gears is ~twice as slow on Linux as on Windows for 945GM
- (sysprof profile, technical discussion about implementation details)
- Need better documentation for GPU instruction sets
- Ideas
- Cross-platform performance and conformance test suite
- Test farm with all supported hardware for regression testing
- Driver developers should be involved from the beginning of product development
Day Two
Zou Nan Hai - GLSL
- GLSL is the OpenGL shading language, a feature of GL 2.0
- There are other high-level shading languages: Cg, HLSL
- Supports vertex and fragment programs with the same syntax
- Provides a C-like language: function calls, conditionals, branches
- Some builtins and keywords are different between vertex and fragment shaders
- Use "varying" variables to pass values between vertex and fragment stages
- (example programs)
- Mesa includes a GLSL frontend, IR, and software backend
- Intel working on 965 backend based on existing program emit code
- 965-glsl branch in Mesa, many demos are working
- Not fully tested yet, some IR ops not implemented yet, error reporting needs work
- Ideas about optimization (register allocation, various builtins)
- Future ideas: debugger, profiler, wider adoption
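To illustrate the "varying" bullet above: a varying is written once per vertex and read once per fragment, after the rasteriser has interpolated it across the primitive. The Python sketch below is neither GLSL nor Mesa code. It only models that data flow for one triangle, using made-up names (`vertex_stage`, `interpolate`, `fragment_stage`, `shade`) and plain barycentric weights in place of a real rasteriser; perspective correction is ignored.

```python
# Toy model of GLSL "varying" data flow; every name here is illustrative.

def vertex_stage(position, color):
    """Per-vertex program: hands the colour on as a varying."""
    return {"position": position, "varying_color": color}

def interpolate(varyings, weights):
    """Blend one varying across the triangle using barycentric weights."""
    return tuple(
        sum(w * v[i] for w, v in zip(weights, varyings))
        for i in range(len(varyings[0]))
    )

def fragment_stage(varying_color):
    """Per-fragment program: reads the interpolated varying."""
    return varying_color

def shade(triangle, weights):
    """Run both stages for one fragment located by `weights`."""
    outputs = [vertex_stage(pos, col) for pos, col in triangle]
    blended = interpolate([o["varying_color"] for o in outputs], weights)
    return fragment_stage(blended)

# One triangle with a red, a green and a blue corner.
triangle = [
    ((0.0, 0.0), (1.0, 0.0, 0.0)),
    ((1.0, 0.0), (0.0, 1.0, 0.0)),
    ((0.0, 1.0), (0.0, 0.0, 1.0)),
]
corner = shade(triangle, (1.0, 0.0, 0.0))    # fragment on the first vertex
inside = shade(triangle, (0.25, 0.25, 0.5))  # fragment inside the triangle
```

A fragment sitting on a vertex sees that vertex's value unchanged, while one inside the triangle sees a weighted mix of all three.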
Keith Whitwell - Gallium, softpipe, Cell, and beyond
- Historically we had a complete core and very compact drivers
- This has been changing: more complex drivers, harder bringup
- Want to fix this. Driver model is no longer correct
- Impose new interfaces and objects
- Stack: Mesa, state tracker, Gallium hardware driver, os/window layer, dri/drm
- New model inspired by GL3, NV_GPU4, 965, etc.
- constant state objects, simple drawing interface
- unified shading language as bytecode, private buffers as render targets
- ability to re-target hw drivers to new APIs, window systems, and OSes
- interface: create/bind/delete state, draw, buffer management, fencing, flush
- state tracker reduces GL to this small interface, deals with GL quirks
- winsys layer does two interfaces: glx/dri and hw/drm
- glx/dri is the window system, cliprects, swapbuffers, etc.
- hw/drm is the kernel, command buffer dispatch, etc
- reference driver is "softpipe", the reference/sample software renderer
- looks very much like a GPU pipeline written for a CPU
- also a proof of concept driver for i915
- two winsys backends: xlib and dri
- Mesa-GL3 basically talks to gallium drivers directly
- eliminates most of the state tracker complexity of GL1/GL2
- Also means you can port the gallium drivers to DX/Vista
- future: softpipe + llvm, pervasive codegen for the whole driver
- should have initial targets of x86 and cell
- failover driver, moves fallback handling out of the driver and state tracker
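The "constant state objects" and "create/bind/delete state, draw" bullets above describe a pattern that can be sketched in a few lines of Python. This is not the Gallium interface: `ToyPipe`, `BlendState` and the method names are hypothetical, chosen only to show the shape of the idea. State is built once into an immutable object, and binding just selects which prebuilt object is current.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BlendState:
    """Constant state object: immutable once created."""
    enabled: bool
    src_factor: str
    dst_factor: str

class ToyPipe:
    """Hypothetical driver context, not the real Gallium interface."""

    def __init__(self):
        self._states = {}       # handle -> state object
        self._next_handle = 1
        self.bound_blend = None
        self.draw_log = []      # (vertex_count, blend state) per draw

    def create_blend_state(self, enabled, src_factor, dst_factor):
        # A real driver could translate to hardware form once, here.
        handle = self._next_handle
        self._next_handle += 1
        self._states[handle] = BlendState(enabled, src_factor, dst_factor)
        return handle

    def bind_blend_state(self, handle):
        # Binding only swaps which prebuilt object is current.
        self.bound_blend = self._states[handle]

    def delete_blend_state(self, handle):
        state = self._states.pop(handle)
        if self.bound_blend is state:
            self.bound_blend = None

    def draw(self, vertex_count):
        self.draw_log.append((vertex_count, self.bound_blend))

pipe = ToyPipe()
opaque = pipe.create_blend_state(False, "one", "zero")
blended = pipe.create_blend_state(True, "src_alpha", "one_minus_src_alpha")
pipe.bind_blend_state(blended)
pipe.draw(3)
pipe.bind_blend_state(opaque)
pipe.draw(6)
pipe.delete_blend_state(opaque)
```

In this sketch the state tracker's job would be to turn GL's many small state setters into a handful of such objects.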
Kristian Høgsberg - Redirected Direct Rendering
- ... and GLX 1.4, and zero-copy tfp, and lazy buffer alloc, and cleanup
- DDX and EXA changes. DDX allocates all buffers as DRM buffer objects
- Shared ancillary buffers allocated dynamically
- EXA calls DDX driver hook to allocate pixmaps as buffer objects
- DRI module changes. Add DDX driver hook to allocate ancillary buffers.
- Adds pixmap private to store ancillary buffers so client sharing works
- This works for all pixmaps, including the screen pixmap
- (technical discussion of semantics of back buffers)
- Hooks SetWindowPixmap to allocate ancillary buffers when redirection happens
- On buffer reallocation, signal the DRI driver through the SAREA
- From DRI driver perspective, redirecting and resizing are the same operation
- Again, works for the screen pixmap too
- Breaks the DRI API slightly, removing X assumptions
- New extension mechanism for the DRI driver to advertise new functionality
- (example of DRI_COPY_SUB_BUFFER extension)
- Redirected direct rendering falls out for free, because the front buffer moves
- GLXPixmaps are mostly the same as a redirected window now
- GLXPbuffers are also easy to create
- Add support for glXChangeDrawableAttributes for API reasons
- ... but we never clobber them, so you never need to send out the event
- glXGetProcAddress is already done, so with all that, we get GLX 1.4!
- Need to fix visual setup so you can advertise fbconfigs with pbuffer support
- GLX_texture_from_pixmap gets cooler too
- Pixmaps are now buffer objects, so you can bind them to textures directly
- SAREA is now the only user of drmAddMap/drmMap
- Suggestion for SAREA journal and/or deletion, and upgrade path
- (really hot demo)
Lots of people - TTM BoF
- Things to fix for upstreaming: accounting pinned memory, initial memory size
- Is the API finished? Is superioctl correct?
- There's a NO_MOVE flag and a NO_EVICT flag, discussion about why both
- Want to remove hardware lock requirement for creating buffers
- Split buffer creation and buffer validation
- Locking issues surrounding the 'kernel context'
Dan Amelang - jitblt
- Addresses the problem of the complexity of Render compositing operations
- Currently implemented as a big case explosion of unpack/composite/pack
- Plus some special casing for fastpaths
- jitblt replaces that with a dynamic code generator
- implemented in a lispish microlanguage called jolt
- declarative syntax for pixel formats and compositing operators
- optimization passes for constant folding, algebraic identities, and CSE
- pixman goes from 9500 lines to ~750
- faster than pixman for anything in fbCompositeGeneral
- comparable for many common fastpaths. slower for arm and for memcpy.
- still lots of room for optimization
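A rough illustration of what the optimisation passes in the jitblt notes above buy you: describe the compositing operator once as an expression, then let constant folding and algebraic identities specialise it when something is known about the operands. The sketch is Python, not jolt. The tuple-based expression format and the `simplify` function are invented for this example, it assumes premultiplied alpha, and it covers folding and identities only, not CSE or code generation.

```python
# Expressions are numbers, variable names (str), or (op, left, right) tuples.

def is_const(node):
    return isinstance(node, (int, float))

def simplify(expr, known):
    """Constant folding plus a few algebraic identities."""
    if isinstance(expr, str):
        return known.get(expr, expr)       # substitute known constants
    if is_const(expr):
        return expr
    op, left, right = expr
    left, right = simplify(left, known), simplify(right, known)
    if is_const(left) and is_const(right):  # constant folding
        return {"+": left + right, "-": left - right, "*": left * right}[op]
    if op == "*":
        if left == 0 or right == 0:        # x * 0 = 0
            return 0
        if left == 1:                      # 1 * x = x
            return right
        if right == 1:                     # x * 1 = x
            return left
    if op == "+":
        if left == 0:                      # 0 + x = x
            return right
        if right == 0:                     # x + 0 = x
            return left
    return (op, left, right)

# Porter-Duff "over" with premultiplied alpha: src + dst * (1 - src_alpha)
OVER = ("+", "src", ("*", "dst", ("-", 1, "src_alpha")))

general = simplify(OVER, {})                          # nothing known
opaque = simplify(OVER, {"src_alpha": 1})             # opaque source
transparent = simplify(OVER, {"src_alpha": 0, "src": 0})
```

With an opaque source the whole expression collapses to a plain copy of `src`, which is the kind of fastpath that otherwise has to be written by hand.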
- Currently, the usage model is one computer per person
- This doesn't scale well: cost, energy, resources, maintenance
- How do you share the system without degrading the user experience?
- "nivo": network in, video out. vga+input over ethernet, ultra-thin client.
- Also available as vga+input over usb, available in many products
- Currently driven as a pipe to Xvnc
- Developing an Xorg driver as well for better performance
- not a typical driver: both input and output, no modesetting, not pci
- Working on Xorg 7.x, DVD playback over USB with no optimisation effort
- Need to finish keyboard, rotation, Xv, hotplugging
- Xi, hotplug, MPX, XKB, DDX/DIX status, event refurbishment
- Xi was developed for "one mouse for normal UI, plus a spaceball"
- Basically only used in the gimp.
- Exposes a device with an arbitrary number of buttons, axes, keys.
- You walk the device list, open your device, and get exclusive event delivery
- Horrible event model though. The better model would be core + device id.
- Once you do that, hotplug is pretty easy.
- MPX is multiple pointers with multiple cursors. Which means multiple foci.
- "Multiple grabs mean head-exploding issues."
- XKB is wretched. Exposes bad binary format over the wire.
- API was defined in terms of (bad) implementation.
- New API that matches use was patched on as badly as possible.
- But, it's very necessary. So how do you fix it?
- Delete the old API, everything is rules-model-layout-variant-options
- Fold xkbcomp into the server to speed up startup and make it work
- Far too much stuff was handled in the DDX, identically among all of them
- input hotplug moves as much as possible to the DIX layer
- Get*Events return a pile of wire protocol xEvents
- Requires much back-inference to actually send the events
- Needs a rework, there is a plan
- Remaining issues
- High keycodes - can't solve without a protocol bump
- More than four layouts - either a ridiculous server hack, or a protocol bump
- Multiple pointers - almost done!
- Code quality is wretched, needs to be redone for bugs' sake
- Input hotplug basically not adopted yet
Peter Hutterer - MPX
- MPX basics
- Currently, there's only one pointer and keyboard; how lame
- MPX splits that into a cursor per device
- core pointer and keyboard are now virtual, only for compatibility
- multiple devices in any client, multiple clients at a time
- nothing changed if you only have one set of devices
- (demo)
- MPX quirks
- Enter/Leave and FocusIn/FocusOut changed.
- First pointer in sends an Enter event, last one out sends a Leave
- Same for Focus; first keyboard in sends a FocusIn, last one out sends FocusOut
- Keyboard has to be paired with a pointer
- Pairing rules
- Pair keyboard with first unpaired pointer
- Else, pair keyboard with first pointer
- Else if no real pointer, pair with the virtual core pointer
- API for changing the default rules
- For ambiguous requests, the client is assigned a "client pointer"
- Grabs: formerly, core devices with core grabs, single device with Xi grab
- Now, grabs act on a single device, so multiple clients can grab
- For core events, grabs mean the whole UI is bound to just that device
- Core grab means 1:1 mapping between device and client
- Xi device grab means device bound to client but client can talk to many devices
- Passive grabs are owned by the device that activates the grab
- Remaining issue: N devices per cursor and grab semantics
- XGE
- The core event spec is painfully constrained. 32 bytes per event, 64 events.
- Generic events fix this, more space per event, more event types
- Server can't send a long event until the client says it knows about them
- MPX uses this for several things, required for multitouch
- MPX multitouch
- Mouse is fine for point events, but some touchscreens send multiple blobs
- New XBlobEvent that just shoves data to the clients
- 282 lines for all this plus pointer emulation on the hotspot
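The default keyboard pairing rules listed earlier in these MPX notes are simple enough to write down directly. The following Python sketch is not taken from the MPX code; `pair_keyboard`, the device names and the `pairings` dictionary are all hypothetical, and it only mirrors the three rules as the notes state them.

```python
VIRTUAL_CORE_POINTER = "virtual core pointer"

def pair_keyboard(keyboard, pointers, pairings):
    """Pick a pointer for `keyboard` using the default rules.

    `pointers` lists the real pointer devices in order; `pairings`
    maps already-paired keyboards to their pointers and is updated.
    """
    taken = set(pairings.values())
    for pointer in pointers:
        if pointer not in taken:   # rule 1: first unpaired pointer
            choice = pointer
            break
    else:
        # rule 2: first pointer; rule 3: virtual core pointer if none exist
        choice = pointers[0] if pointers else VIRTUAL_CORE_POINTER
    pairings[keyboard] = choice
    return choice

pairings = {}
pointers = ["mouse A", "mouse B"]
first = pair_keyboard("kbd 1", pointers, pairings)
second = pair_keyboard("kbd 2", pointers, pairings)
third = pair_keyboard("kbd 3", pointers, pairings)   # every pointer is taken
lonely = pair_keyboard("kbd 4", [], {})              # no real pointer at all
```

Here the first two keyboards each get their own mouse, the third falls back to the first pointer, and a keyboard on a system with no real pointer ends up on the virtual core pointer.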
Day Three
Matthew Garrett - ACPI
- ACPI started as a replacement for APM, MP-BIOS, interrupt routing, e820
- Interface between the firmware and the OS, bonghits, misc, other
- Often extended in undocumented, inconsistent ways by vendors
- ACPI 2.0 added an optional video extension; now required by Vista
- Gives you display switching, brightness control, EDID, POST device
- _DOS method (display output switch) switches displays and gives you an event
- Sometimes you can stop the system from doing this and handle it yourself
- Brightness control gives you events on change, and a get/set interface
- But the platform may change brightness anyway without telling you
- Display enumeration gives you CRTCs and physical connectors
- Might tell you connection status, if you're lucky
- Will tell you whether the port is currently present (or on a dock)
- ROM method if you don't have a PCI BAR, except it's not the ROM
- Can query the boot adaptor, and can set the boot adaptor
- EDID method mandatory if it's not available some other way
- Impossible to map the ACPI video ID to a PCI slot
- OpRegions (unpublished spec) provides a backchannel for communication
- ACPI doesn't require anything of the gfx hardware over suspend
- So we try to POST and hope, and sometimes that works
- Or you can do VBE save and restore state, and sometimes that works
- Or you can do pre-VBE set text mode, and sometimes that works
- Mostly it just doesn't
Dave Airlie, Jesse Barnes, Jakob Bornecrantz - Kernel Modesetting BoF
- API looks pretty much like the RANDR protocol now
- Add a DRM control file for persistent state (since we can't with current DRM)
- Use privileged switcher for fast user switching
- Requires some enhancements to the resume path to really work nicely
- Memory eviction on suspend is hard
Adam Jackson - Monitors and why they hate you
I didn't take notes on my own talk. Anybody?
Egbert Eich, Luc Verhaegen - ATI Driver Status
- When ATI dropped support for the open driver, distros' lives got hard
- SuSE has an end-user installer, but that's still not perfect
- Started discussions about how to enable an open source driver again
- ATI approached SuSE to develop a driver, working on it for the last ~6 weeks
- Mostly working, still some output setup buglets
- (Details about some bringup issues)
- First code drop planned for Monday
- Docs approved for release without NDA about 45 seconds ago
Zack Rusin - LLVM for Mesa
- LLVM has two primary components, the optimizer and the codegen, plus a gcc front-end
- Standard suite of SSA-based optimizations, support for many targets
- Strongly typed, extensible, vector-aware IR
- Good pipeline, link time opt, jit, easy to work with and learn
- Well maintained, BSD licensed, great for us to use but not care about
- So: Cg/HLSL/GLSL -> LLVM IR -> Optimizer -> codegen or jit
- Minor problem since it assumes the target CPU has branches
- But really easy to wire up already
Daniel Stone - KDrive futures
- Xorg within spitting distance of being as small as KDrive
- Move generally useful Xorg stuff to DIX, at which point there's not much difference between the two
- Smaller systems mostly running KDrive on framebuffer because everyone else is
- Pure size isn't too much of an issue, compared to FPU/cache locality/etc
Eric Anholt, Adam Jackson, Daniel Stone - Future releases
- 1.4.1: 1st November, 2007. Daniel as RM, nominate on Server14Branch as usual.
- 1.5.0: 1st March, 2008
- Scribbled on the board for 7.4/1.5: XGE, XACE, RandR 1.3 (GPU object), input transformation, pci-rework, XKB 2, _X_EXPORT, DRI memory manager, GLX 1.4, Glucose
- 7.5 features: MPX, lifting DMX up to DIX
Stephane Marchesin, Jerome Glisse - Reverse Engineering
- We have kernel source, but no driver source.
- So we can watch everything the closed driver does and try to figure it out
- Or, that's the theory
- Simple tools: register dumper, BIOS emulator
- Complex tools: libsegfault (unmap the card, trap the fault, record, put back)
- Later done better as valgrind-mmt. But all this is userspace only.
- mmiotrace works on kernel-level accesses too
- renouveau: find the command fifo, feed it GL, watch for changes
- Inspect one thing at a time: single video mode at different refresh rates, eg
- Do lots of dumps so you can identify the commonalities and differences
- (example of delta finding)
- Future work on higher level tools, other architectures, etc.
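The "do lots of dumps" approach above comes down to comparing register dumps that differ in exactly one setting and keeping the registers whose values moved. A minimal Python sketch of that delta-finding step follows. It is not code from renouveau or any of the tools named in the notes; the function name, the register offsets and the values are all made up for illustration.

```python
def changed_registers(dumps):
    """Return {offset: [value per dump]} for registers that differ.

    Each dump maps a register offset to the value read from it. Only
    offsets present in every dump are compared.
    """
    common = set(dumps[0])
    for dump in dumps[1:]:
        common &= set(dump)
    return {
        offset: [dump[offset] for dump in dumps]
        for offset in sorted(common)
        if len({dump[offset] for dump in dumps}) > 1
    }

# Hypothetical dumps of the same video mode at three refresh rates.
dump_60hz = {0x0000: 0x1234, 0x0004: 0x00A0, 0x0008: 0x0300}
dump_75hz = {0x0000: 0x1234, 0x0004: 0x00C8, 0x0008: 0x0300}
dump_85hz = {0x0000: 0x1234, 0x0004: 0x00E2, 0x0008: 0x0300}

delta = changed_registers([dump_60hz, dump_75hz, dump_85hz])
for offset, values in delta.items():
    print(f"0x{offset:04x}: " + " ".join(f"0x{v:04x}" for v in values))
```

Registers that stay constant drop out, so what remains is the short list worth investigating for the one setting that was varied.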
Eamon Walsh - XACE Demo
- (Demo of protected windows)