This is a story about how I came upon an idea for making a library much easier to use, entirely by accident. I write a lot of things in Haskell, but sometimes there just aren’t the right libraries yet for the kind of thing I want to do. One of those times involved some user task automation: basically simulating the keyboard and mouse to perform repetitive tasks by taking over the user’s control of the computer. I’d used PyAutoGUI for this kind of thing before, but there’s no simple binding to the APIs it’s using (I believe PyAutoGUI does different things on each of Windows, macOS, and Linux, so it would be a lot of work to replicate all that). Instead, I landed on the idea for my hsautogui project: writing bindings to the existing PyAutoGUI code.
It’s not too bad to have to talk to different languages from Haskell, but generally you have to go over the C FFI to get things working. Luckily for me, someone had already written Haskell bindings for the CPython API, which is a huge chunk of work I didn’t have to do, and am grateful for. Unluckily, it looked like the last time they had been touched was five years ago, for Python 3.4, and I wanted to use 3.9. In addition, because the bindings were so low-level—grabbing pointers to Python objects, losing any kind of type information because Python doesn’t really roll that way, etc.—it was a lot of work to write individual AutoGUI functions that I wanted in Haskell, and most of that work was repetitive wrapping/unwrapping boilerplate.
So, I did what any sane person would do, and decided to write a library in between the library I was writing and the library I was using, so I could use that library instead of the previous more-complicated library to write my library.
While it was tempting to think like a library writer about the details of how things would work, instead I started out from the point of view of a user, as the writer of hsautogui. What would make my life the easiest? Who cares how much work the library writer has to do to support it—I’ll do that later.
What I wanted the most was to write the bare minimum wrapper over Python functions, turning them into Haskell functions. Something like this:
myFunction :: Arg1 -> Arg2 -> Something
myFunction arg1 arg2 = call "function" arg1 arg2
However, there are obviously problems with that. How will call know how many arguments it takes? I didn’t want to go too far into type-level magic, so changed the args to a list. How does Python know to call functions from within certain modules? We can add that argument as well. If users are seeing too much repetition there, currying makes it easy enough to write call' = callModule "pythonModuleName" if we keep it as the first argument.
myFunction :: Arg1 -> Arg2 -> Something
myFunction arg1 arg2 = call "module" "function" [arg1, arg2]
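To illustrate the currying remark, here is a minimal sketch that runs without Python installed. The names describeCall and describeRandom are hypothetical stand-ins, not part of the library: they only render the Python expression as a String, but they show why keeping the module name as the first argument makes partial application convenient.

```haskell
import Data.List (intercalate)

-- Hypothetical stand-in for the real call: it only renders the Python
-- expression as a String instead of invoking CPython.
describeCall :: String -> String -> [String] -> String
describeCall moduleName functionName args =
  moduleName ++ "." ++ functionName ++ "(" ++ intercalate ", " args ++ ")"

-- Partially apply the module name once and reuse the result.
describeRandom :: String -> [String] -> String
describeRandom = describeCall "random"

-- Renders the Python expression for rolling a six-sided die.
example :: String
example = describeRandom "randint" ["1", "6"]
```

Because the module name comes first, a wrapper library that targets a single Python module only has to mention that module once.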
Also, Python has a nice system for keyword arguments. The natural type for these is probably a Map Text Arg, but it turned out that Maps were a bit too cumbersome to actually work with for such simple wrapper functions. So let’s go with the classic: a list of pairs.
myFunction :: Arg1 -> Arg2 -> KeywordArg -> Something
myFunction arg1 arg2 arg3 = call "module" "function" [arg1, arg2] [("arg3Name", arg3)]
This now looks pretty much exactly how call’s type signature ended up:
call :: FromPy a
  => Text -- module name
  -> Text -- function name
  -> [Arg] -- arguments
  -> [(Text, Arg)] -- keyword arguments
  -> IO a
We’ll examine that Arg type in a bit.
Another thing I wanted, as a user of my own upcoming library, was for it to be dead simple to shuttle data back and forth between Haskell and Python, while preserving as much of the Haskell types as I could. The simplicity was the key, though. From this idea we’re pretty quickly led to the idea of using To and From instances, so we can have e.g. 7 be an instance of FromPy and ToPy, and then use it as an argument without caring about how it’s getting converted. SomeObject is how haskell-cpython represents, well…some Python object. And that’s about all we know about these objects from looking at them.
class FromPy a where
  fromPy :: Py.SomeObject -> IO a

class ToPy a where
  toPy :: a -> IO Py.SomeObject
The instances for even super-simple types can get kind of out of hand. It’s nice that we no longer have to manually write them, as library users.
instance FromPy Bool where
  fromPy pyB = do
    isTrue <- Py.isTrue pyB
    isFalse <- Py.isFalse pyB
    case (isTrue, isFalse) of
      (True, False) -> pure True
      (False, True) -> pure False
      (False, False) -> throwIO . PyCastException . show $ typeRep (Proxy :: Proxy Bool)
      (True, True) -> throwIO . PyCastException $ (show $ typeRep (Proxy :: Proxy Bool)) ++
        ". Python object was True and False at the same time. Should be impossible."
These instances make the notion of “things that can be converted to/from Python” explicit. You may have seen this to/from instance pattern in other places, like Aeson’s ToJSON and FromJSON. It’s convenient to have typeclasses handle the marshalling of data behind the scenes, so we don’t have to think about it nearly as much. Just call toPy or fromPy.
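The to/from pattern can be tried out without a Python interpreter. The sketch below is a toy, not the library’s code: PyValue is a hypothetical stand-in for Py.SomeObject, and ToPyValue/FromPyValue are pure, simplified versions of the real classes (which run in IO and throw on a failed cast rather than returning Maybe).

```haskell
-- A toy stand-in for Py.SomeObject. The real type is an opaque handle to a
-- Python object; this one is an ordinary Haskell sum type.
data PyValue
  = PyInt Integer
  | PyStr String
  | PyBool Bool
  deriving (Show, Eq)

-- Hypothetical, pure analogues of the ToPy and FromPy classes.
class ToPyValue a where
  toPyValue :: a -> PyValue

class FromPyValue a where
  fromPyValue :: PyValue -> Maybe a

instance ToPyValue Integer where
  toPyValue = PyInt

instance FromPyValue Integer where
  fromPyValue (PyInt n) = Just n
  fromPyValue _         = Nothing

instance ToPyValue Bool where
  toPyValue = PyBool

instance FromPyValue Bool where
  fromPyValue (PyBool b) = Just b
  fromPyValue _          = Nothing

-- Convert to the toy Python representation and back again.
roundTrip :: (ToPyValue a, FromPyValue a) => a -> Maybe a
roundTrip = fromPyValue . toPyValue
```

Here roundTrip (7 :: Integer) gives back Just 7, while asking for a Bool out of a PyStr yields Nothing. Callers never name a conversion function per type; the instance is chosen from the types involved.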
One thing we might notice is that Python essentially takes a list of arguments (myFunc(7, "hello", True)), but unlike Haskell lists, these arguments don’t have to have the same type. In fact, usually they won’t all have the same type.
A quick way to fix this is with existential types. Let’s create a type Arg that represents an argument to a Python function.
data Arg = forall a. ToPy a => Arg a
Of course, we want Arg itself to be an instance of ToPy, so we can grab the ToPy-able thing inside it, as Python.
instance ToPy Arg where
  toPy (Arg a) = toPy a
Because of the generality of a, all we know about what’s in a [Arg] is that each element has a ToPy constraint. This means just about the only thing we can do with such a list’s elements is call toPy on them.
With this handy, it’s now possible to build heterogeneous lists of things that have ToPy instances. Nifty.
sampleArgs :: [Arg]
sampleArgs =
  [ Arg (7 :: Integer)
  , Arg ("hello" :: Text)
  , Arg (True :: Bool)
  ]
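The existential trick is easy to experiment with on its own. This is a self-contained sketch under simplifying assumptions: ToPyLiteral, ToyArg, and renderArgs are hypothetical names, and instead of producing real Python objects the class just renders each value as Python literal syntax.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

import Data.List (intercalate)

-- Toy stand-in for ToPy: render a value as Python literal syntax.
class ToPyLiteral a where
  toPyLiteral :: a -> String

instance ToPyLiteral Integer where
  toPyLiteral = show

instance ToPyLiteral Double where
  toPyLiteral = show

instance ToPyLiteral Bool where
  toPyLiteral True  = "True"
  toPyLiteral False = "False"

-- The wrapper hides the concrete type and keeps only the class dictionary.
data ToyArg = forall a. ToPyLiteral a => ToyArg a

-- The one thing we can do with the hidden value is use its class method.
renderArg :: ToyArg -> String
renderArg (ToyArg a) = toPyLiteral a

-- Render a heterogeneous argument list the way Python would write it.
renderArgs :: [ToyArg] -> String
renderArgs = intercalate ", " . map renderArg

sampleToyArgs :: [ToyArg]
sampleToyArgs = [ToyArg (7 :: Integer), ToyArg (2.5 :: Double), ToyArg True]
```

Evaluating renderArgs sampleToyArgs produces "7, 2.5, True". Once a value is wrapped, its original type is gone for good, which is fine here because the list is only ever consumed by converting each element.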
Besides calling functions in a module, you might also want to set some attribute in them, or read some attribute from them.
setAttribute :: ToPy a
  => Text -- ^ module name
  -> Text -- ^ attribute name
  -> a -- ^ value to set attribute to
  -> IO ()

getAttribute :: FromPy a
  => Text -- ^ module name
  -> Text -- ^ attribute name
  -> IO a
What happens if, say, we try to getAttribute a Text but Python gives us back an int rather than some bit of unicode?
Well, here’s the magic:
easyFromPy :: (Py.Concrete p, Typeable h)
  => (p -> IO h) -- ^ python from- conversion, e.g. Py.fromFloat
  -> Proxy h -- ^ proxy for the type being converted to
  -> Py.SomeObject -- ^ python object to cast from
  -> IO h -- ^ Haskell value
easyFromPy convert typename obj = do
  casted <- Py.cast obj
  case casted of
    Nothing -> throwIO $ PyCastException (show $ typeRep typename)
    Just x -> convert x
Now to explain. p and h are mnemonics for a Python object and a Haskell value, respectively. The Python object has to be Concrete. Basically, all the primitive types (numbers, strings, etc.) we want to eventually cast to are going to be Concrete, but you can read the source for more details there. The Haskell value has to be Typeable, which gives us the ability to get the name of the type of the value using typeRep. This type information is passed along via Proxy. For example, show (typeRep (Proxy :: Proxy Integer)) gives us "Integer". It’s a nice way to chuck the name of a type into an error message, but it does require the user to give us a Proxy carrying the right type along. Usually, that’s pretty easy, since they’ll be calling easyFromPy in the context of writing a FromPy instance which knows which type it’s for. For example:
instance FromPy Double where
  fromPy = easyFromPy Py.fromFloat Proxy
Here, the compiler can infer that our Proxy is carrying along a Double. Using typeRep, we can throw an exception with information about which type we failed to cast to.
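The Proxy and typeRep mechanics can be seen in isolation with a rough analogue of easyFromPy that parses a String instead of casting a Python object. Everything named here (CastError, parseOrThrow, parseDouble, safeParseDouble) is hypothetical and only mirrors the shape of the real function.

```haskell
import Control.Exception (Exception, throwIO, try)
import Data.Typeable (Proxy (..), Typeable, typeRep)
import Text.Read (readMaybe)

-- Hypothetical counterpart of PyCastException: carries the target type name.
newtype CastError = CastError String
  deriving (Show, Eq)

instance Exception CastError

-- Analogue of easyFromPy. On failure, the Proxy supplies the type name
-- for the error message; it carries no runtime value of its own.
parseOrThrow :: (Read h, Typeable h) => Proxy h -> String -> IO h
parseOrThrow proxy raw =
  case readMaybe raw of
    Nothing -> throwIO (CastError (show (typeRep proxy)))
    Just x  -> pure x

-- The type signature lets the compiler infer that Proxy is a Proxy Double.
parseDouble :: String -> IO Double
parseDouble = parseOrThrow Proxy

-- Catch the exception so the failure can be inspected as a value.
safeParseDouble :: String -> IO (Either CastError Double)
safeParseDouble = try . parseDouble
```

With this sketch, safeParseDouble "1.5" yields Right 1.5, and safeParseDouble "oops" yields Left (CastError "Double"), so the error names the type we failed to produce.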
What’s the difference between randint and uniform here?
randint :: Integer -> Integer -> IO Integer
randint low high =
  call "random" "randint" [arg low, arg high] []

uniform :: Integer -> Integer -> IO Double
uniform low high =
  call "random" "uniform" [arg low, arg high] []
Besides calling different Python functions, they also return different types. Because FromPy is working behind the scenes here, we don’t even really need to think about this while writing a wrapper library. Because our type signatures are handily nearby, call knows which type to try to coax the SomeObject Python returns into being. We can also use the TypeApplications language extension to explicitly tell call what type to marshal the value it gets into, if needed.
call @Double "random" "uniform" [arg low, arg high] []
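Return-type polymorphism of this kind is not specific to Python bindings. As a small sketch, fakeCall below is a hypothetical stand-in for call that parses a String with Read rather than invoking Python. The result type, whether it comes from a signature or from a type application, decides which conversion runs.

```haskell
{-# LANGUAGE TypeApplications #-}

import Text.Read (readMaybe)

-- Hypothetical stand-in for call: the caller's expected type picks the parser.
fakeCall :: Read a => String -> Maybe a
fakeCall = readMaybe

-- The type signature alone selects the Integer parser.
asInteger :: Maybe Integer
asInteger = fakeCall "3"

-- The same input, with the result type chosen by an explicit type application.
asDouble :: Maybe Double
asDouble = fakeCall @Double "3"

-- A type that cannot be parsed from this input fails instead of guessing.
asBool :: Maybe Bool
asBool = fakeCall @Bool "3"
```

The type application is mostly useful when no signature is nearby, for example when the result is consumed immediately inside a larger expression.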
There’s quite a lot of consideration that goes into making something so simple! The API surface we ended up with is merely call to call Python functions, and getAttribute/setAttribute to get and set attributes. ToPy and FromPy instances handle type marshalling for us, and we use Arg to import heterogeneous argument lists from Python into Haskell land.
I guess it’s time to get back to improving hsautogui, now that it’s running on CPython.Simple.