Only then are you truly putting a solid boundary between your library and the folks using your library. Everything else is just praying that you and only you have an underscore on your keyboard! :)
And of course another alternative is to accept that there is no true private in Python other than defdef*, so you allow your ShippingOption to be publicly visible while also documenting that the helper-constructors are what should really be used.
*”defdef” as in function definitions inside other function definitions — closures if you will, although I prefer to write mine as taking most if not all their parameters explicitly:
def public(foo):
def private(foo):
…
class Private:
… # less common
…If you have members that users probably shouldn't touch, you prepend them with an underscore. This is just a hint; It doesn't actually change anything. We're all adults here and we know the consequences of reaching into implementation details.
In practice, I found it difficult for coworkers to read and understand so I dropped the idea.
Another limitation I found is that it breaks down when you start using inheritance. For example:
```
class _A: pass
A = NewType("A", _A)
class _B(_A): pass
B = NewType("B", _B)
def foo(a: A) -> None: pass
b = B(_B())
foo(b) # Mypy is not happy: Argument 1 to "foo" has incompatible type "B"; expected "A"
foo(A(b)) # Mypy is OK
```
That way, I can use "normal" naming in `class RealShipOpts:...`, and be explicit that it's not really public for the end users (they should use the `.api` module instead).
It's sad to see that many features regarding object-oriented programming and static typing are implemented worse in Python than Java. Various examples: __str__() vs. toString(); underscore vs. private; @staticmethod/@classmethod vs. static; generic types are so clunky in Python; types are not shown in the official Python standand library documentation; __init__() doesn't force you to call super() whereas it's mandatory in Java; @override (Python 3.12; year 2023) copying Java @Override (JDK 1.5; year 2004) very late; convention changing from duck typing (always available in Python) to structural typing (optional in Python, mandatory in Java).
I wish you were right but, IMHE, it requires a lot of communication once teams grow and many team member do not fully understand the consequences of what they do. It is nice to have something that helps when reviewing code.
> If you have members that users probably shouldn't touch, you prepend them with an underscore
Well, this is precisely what TFA does. It prepends the constructor with an underscore.
That PR might well be rejected. And you have to work with the module owners to get your case supported.
Anything else is not responsible and I would not call it "adult".
> even if you keep all your fields private, the constructor is still, inherently, public.
ShippingOptions and the literals / enums are part of the public API, so the user would just be writing
ShippingOptions(Carrier.USPS, Conveyance.Air)
with no hint that they're doing anything wrong.Dataclasses do have a `kw_only` option, but I'm not sure how well underscore prefixes would be understood as private parameters / a private ctor, whereas wrapping a clearly "private" type should be clear to everybody.
Glyph is not entirely correct on the "any class" bit as you can always break the default init path:
class ShippingOptions:
_ship: Literal["fast", "normal", "slow"]
__init__ = None
def shipFast() -> ShippingOptions:
opts = object.__new__(ShippingOptions)
opts._ship = "fast"
return opts
however that's a pretty ugly pattern, and unlike the one they propose I doubt tooling would understand it.Like the rest of that circle, he moves with the times, supports public shaming of Tim Peters and others and now promotes poorly implemented information hiding so Python ticks a few more boxes for the industry.
Information hiding in a language that allows changing the values of small integers at runtime via ctypes is doomed anyway. And there are plenty of better languages that do it out of the box and in a straightforward manner.
They are basically describing a public API backed by a private type that they can extend, rearrange, or otherwise modify without breaking the public contract.
from typing import *
class _A:
pass
class _B(_A):
pass
A = NewType("A", _A)
B = NewType("B", _B)
def foo[T: (A, B)](val: T) -> T:
return val
a = A(_A())
b = B(_B())
_a = foo(a)
_b = foo(b)
reveal_type(_a)
reveal_type(_b)
Playground here: https://mypy-play.net/?mypy=latest&python=3.12&gist=36573363...My real problem with the evolution of Python is that initially, the language and the community was positioned as anti-Java, anti-big-OOP-like-C++, and then it changed into the thing that it was against, but in a roundabout and suboptimal way. To me, the initial vibe of Python was, "write a 100-line script, don't worry about explicitly documenting types, don't worry about grand architecture, don't worry about creating custom classes, don't worry about encapsulation and public/private". I've been with Python since year 2007 in the 2.x days, and Java since 2002.
Initial examples: Why go through the ceremony of `public static void main(String[] args)` when Python just executes the script line by line at the top level? Oh wait, now you have things like `import` actually executing code instead of simply being a compile-time namespace convenience, and you need weird techniques like `if __name__ == "__main__"`. Why `System.out.println()` when `print()` is so much more concise? But now you're polluting the global namespace, and `print(file=sys.stderr)` isn't that elegant either.
Static typing in Python is the biggest hypocrisy ever. As I understood it, Python scripts were meant to be lightweight and free of the tyranny of enterprise OOP which was epitomized by Java. But people found out that keeping track of types in your head is laborious and error-prone, and getting a compiler to check {that the shape of your objects and function calls match} is a huge productivity boost. And so Python 3 enabled static type hints... which, like I said before, Java had from day zero. To make matters worse, static type hint features were introduced progressively over the years, leading to things getting deprecated from the `typing` module and moved to things like `T|None` and `list[T]` and `collections.abc`.
IIRC the old practice in Python was that you specified some kind of interface in prose or in code (e.g. `class IoStream: def read(); def close()`), but you didn't need to explicitly use that interface as a superclass; you can just duck-type your way around things. But this completely goes against static typing, so I'm pretty sure the new preferred way is to explicitly use abstract superclasses... just like Java did all along (and is mandatory).
I really don't think having top-level (module) variables and functions in Python is a good thing, especially because then they are duplicated as fields and methods in classes. In Java, fields and methods (whether static or instance) can only be placed in classes, and I think this particular straitjacket is a good thing.
> because Python wants these things to be optional
We can both agree that Python gives multiple ways to do things (e.g. no static type hints vs. static type hints). This flies in the face of:
> Readability counts.
> The Zen of Python / There should be one-- and preferably only one --obvious way to do it. -- https://peps.python.org/pep-0020/
Probably the most tragic example is the ways to build up strings in Python: `+` and str(), `%` operator, `str.format()`, f-string.
(To be fair, I have a laundry list of complaints about Java too, such as: .class files and the JVM being an intermediate layer that needs to be understood which is actually different from the Java source language, lack of in-place structs so `new Point[]` is very painful on the memory system, awkward string interpolation/formatting compared to Python's f-strings, very awkward JDBC compared to for example Python sqlite3 API, kinda clunky for web server programming, very awkward JSON handling, enterprisey libraries and APIs that are perfectly documented but are impossible to actually understand.)
[^1]: https://en.wikipedia.org/wiki/Python_(programming_language)
[^2]: https://en.wikipedia.org/wiki/Java_(programming_language)
(Some changes in Python 3 I can recall: bytes/str/unicode being the biggest one; fixing mutable variables in nested functions; changing some obscure behavior in class hierarchies and overload resolution; changing things like range() and map() to lazy evaluation.)
For better or for worse, Java has maintained very good (not perfect) compatibility throughout, even with painful changes like generics in 1.5, lambdas in 8, modules in 9, eventual removal of applets and SecurityManager, etc. This also contrasts with C#/.NET, which I think had some breaking changes over the decades.
typing.Protocol is a good fit for this use case
from typing import Protocol
class HasMessage(Protocol):
def get_message(self) -> str: ...
class A:
"""Implicit (duck-typed)"""
def get_message(self) -> str:
return "A"
class B(HasMessage):
"""Explicit"""
def get_message(self) -> str:
return "B"
class C:
def get_message(self) -> int:
return 1
def print_message(m: HasMessage) -> None:
print(m.get_message())
print_message(A())
print_message(B())
print_message(C()) # fails type checkYes, agreed. I used to work on a large python codebase and tried to add type hints where I could. The issue is that python was not the right tool for the job - except that switching to the right tool was a non-starter. So type hints were the best I could do.
(Earliest mention I could find for camelCase methods was 25 years ago: https://github.com/twisted/twisted/blob/d7c19cd40d07c8c37f85... )
Hell no
Let’s say you’re writing a Python library.
In this library, you have some collection of state that represents “options” or “configuration” for a bunch of operations. Such a set of options is a bundle of potentially ever-increasing complexity. Thus, you will want it to have an extremely minimal compatibility surface, with a very carefully chosen public interface, that is either small, or perhaps nothing at all. Such an object conveys state and might have some private behavior, but all you want consumers to be able to do is build it in very constrained, specific ways, and then pass it along as a parameter to your own APIs.
By way of example, imagine that you’re wrapping a library that handles shipping physical packages.
There are a zillion ways to do it ship a package. There are different carriers who can ship it for you. There’s air freight, and ground freight, and sea freight. There’s overnight shipping. There’s the option to require a signature. There’s package tracking and certified mail. Suffice it to say, lots of stuff.
If you are starting out to implement such a library, you might need an object called something like ShippingOptions that encapsulates some of this. At the core of your library you might have a function like this:
1 2 3 4 5 | |
If you are starting out implementing such a library, you know that you’re going to get the initial implementation of ShippingOptions wrong; or, at the very least, if not “wrong”, then “incomplete”. You should not want to commit to an expansive public API with a ton of different attributes until you really understand the problem domain pretty well.
Yet, ShippingOptions is absolutely vital to the rest of your library. You’ll need to construct it and pass it to various methods like estimateShippingCost and shipPackage. So you’re not going to want a ton of complexity and churn as you evolve it to be more complex.
Worse yet, this object has to hold a ton of state. It’s got attributes, maybe even quite complex internal attributes that relate to different shipping services.
Right now, today, you need to add something so you can have “no rush”, “standard” and “expedited” options. You can’t just put off implementing that indefinitely until you can come up with the perfect shape. What to do?
The tool you want here is the opaque data type design pattern. C is lousy with such things (FILE, pthread_*_t, fd_set, etc). A typedef in a header file can easily achieve this.
But in Python, if you expose a dataclass — or any class, really — even if you keep all your fields private, the constructor is still, inherently, public. You can make it raise an exception or something, but your type checker still won’t help your users; it’ll still look like it’s a normal class.
Luckily, Python typing provides a tool for this: typing.NewType.
Let’s review our requirements:
In order to solve these problems respectively, we will use:
NewType, which gives us our public name...NewType.When we put that all together, it looks like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
As a snapshot in time, this is not all that interesting; we could have just exposed _RealShipOpts as a public class and saved ourselves some time. The fact that this exposes a constructor that takes a string is not a big deal for the present moment. For an initial quick and dirty implementation, we can just do checks like if options._speed == "fast" in our shipping and estimation code.
However, the main thing we are doing here is preserving our flexibility to evolve the related APIs into the future, so let’s see how we might do that. For example, let’s allow the shipping options to contain a concrete and specific carrier and freight method:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
|
As a NewType, our public ShippingOptions type doesn’t have a constructor. Since _RealShipOpts is private, and all its attributes are private, we can completely remove the old versions.
Anything within our shipping library can still access the private variables on ShippingOptions; as a NewType, it’s the same type as its base at runtime, so it presents minimal1 overhead.
Clients outside our shipping library can still call all of our public constructors: shipFast, shipNormal, and shipSlow all still work with the same (as far as calling code knows) signature and behavior.
If you need to build and convey some state within your public API, while avoiding breakages associated with compatibility churn, hopefully this technique can help you do that!
Thanks for reading, and thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor.
NewType is to call it like a function, as I’ve done in these examples, but if you are wanting to use this pattern inside of a hot loop, you can use # type: ignore[return-value] comments to avoid that small cost. ↩